ABSTRACT


The  purpose  of  this  thesis  is  to  investigate  two  classes  of 
problems  with  reference  to  binomial  sampling  plans.  Firstly,  we  make  a 
detailed  examination  of  simple  binomial  sampling  plans  with  two  parameters 
from  the  point  of  view  of  estimation  theory,  studying,  in  particular,  the 
completeness  of  such  sampling  plans.  The  second  problem  with  which  we  are 
concerned  is  the  role  of  simplicity  in  the  optimal  truncated  sequential 
plans  described  by  Blackwell  and  Girshick.  In  this  connection,  we  study 
the  relationship  of  binomial  sampling  plans  as  described  by  Girshick, 
Mosteller  and  Savage  and  the  "optimal  plans"  described  by  Blackwell  and 
Girshick. 

Chapter I presents a résumé of the relevant work on binomial
sampling  plans.  We  reformulate  several  of  the  results  and  present  a 
different  approach  and  some  alternative  definitions  to  bring  out  more 
clearly  the  fundamental  problems  of  unbiased  sequential  estimation  for 
two  parameter  binomial  populations.  We  illustrate  these  results  by 
proving  the  completeness  of  two  families  of  two  parameter  binomial 
distributions  and  list  explicitly  the  polynomials  estimable  unbiasedly 
in  these  plans.  The  optimal  sampling  plans  of  Blackwell  and  Girshick 
are  interpreted  in  terms  of  the  more  intuitive  characterization  of  binomial 
sampling  plans  first  given  by  Girshick,  Mosteller  and  Savage  as  a  preliminary 
to  our  discussion  in  Chapter  III,  which  enables  us  to  decide  when  an  optimal 
plan  is  simple. 

In Chapter II, we prove the completeness of another family of
two parameter binomial distributions. Using the concepts and alternative
definitions given in Chapter I, we are able to give the number of
polynomials in the basis of the linear space of estimable polynomials for
large classes of simple sampling plans and an upper bound for this number
for any simple sampling plan. Certain topics and theorems which have
been touched upon but not fully discussed in the literature are brought
out as a natural consequence of our discussion.

The relationship between simplicity and the risk functions of
optimal sampling plans is discussed in Chapter III. By means of an
algorithm, we give a method for discerning readily whether the sampling
plans described by the cylinder sets of Blackwell and Girshick are simple
or not. Sufficient conditions for certain optimal sampling plans to be
simple are given and illustrated by several examples.
CHAPTER I


SIMPLE  SAMPLING  PLANS  AND  SOME  OF  THEIR  PROPERTIES 

1.1  Introduction

Binomial sampling plans and problems of estimation and completeness
for these plans were first studied by Girshick, Mosteller and Savage
[9] in 1946. They described these plans in terms of boundary, continuation
and inaccessible points and introduced the concept of simplicity. Further
work on simple sampling plans was carried out by De Groot [6], who studied
unbiased sequential estimation in these plans and gave an elegant
characterization of simple sampling plans of size n. Other characterizations
of such plans in terms of vectors with non-negative integral components
are available ([5], [12]) and are due to Narayana, Brainerd and Mohanty.

Simple  sampling  plans  with  two  parameters,  which  can  be  considered  as  an 
extension  of  the  basic  model  in  [9],  have  occasionally  been  considered  in 
the literature, e.g. by Gabriel [8] and by Blackwell and Girshick ([4], cf. p. 222).
We  introduce  and  study  in  this  chapter  the  simplest  sampling  plans  of 
size  n  with  2  parameters  in  order  to  indicate  the  type  of  estimation 
problems  considered  in  this  thesis. 

From  a  very  different  point  of  view,  Blackwell  and  Girshick 
have  delineated  "optimal  sequential  sampling  plans"  in  their  discussion  of 
Bayes  procedures  for  sequential  games  ([4],  particularly  Chapters  9  and  10). 
While  it  is  easily  seen  that  their  very  general  definition  of  sequential 
sampling  plans  in  terms  of  cylinder  sets  includes  all  binomial  sampling 
plans  of  size  n,  we  propose  to  investigate  optimal  sequential  sampling 
plans  in  certain  special  cases  where  the  definitions  in  terms  of  cylinder 
sets and in terms of boundary, continuation and inaccessible points are
both available. With this end in view, we shall present a brief though
somewhat incomplete review of the results of Blackwell and Girshick. It
is not the purpose of this chapter to provide a detailed résumé of all the
diverse results known concerning sampling plans, since excellent accounts
are available in the references quoted above ([4], [5], [6], [12]). We
shall, however, present a review of certain results and theorems which are
directly relevant to our work, and especially indicate various modifications
of notation and results which will prove useful in later chapters.

Section  1.2  describes  the  simplest  sampling  plan  of  size  n  with 
2  parameters,  namely  the  "fixed"  sampling  plan  of  size  n.  The  next  two 
sections  discuss  various  estimation  and  completeness  properties  of  this 
plan  and  motivate  our  study  of  simple  sampling  plans  with  2  parameters 
in  Chapter  2  and  the  latter  part  of  this  thesis. 

Sections 1.5 and 1.6 give a brief résumé of known results
concerning characterizations of simple sampling plans and introduce the
concept of "canonical sequences of deformations", following the
development of Brainerd and Narayana [5]. The last section summarizes the
procedure described by Blackwell and Girshick [4] for obtaining optimal
sequential sampling plans and interprets this procedure in terms of
boundary, continuation and inaccessible points in the many cases where
this is possible.

1.2  The  "Fixed"  Sampling  Plan  of  Size  n. 

In order to motivate our study of two parameter binomial sampling
plans, we shall introduce in this section a particularly simple plan of
this type, namely the "fixed" sampling plan of size n. This sampling
plan generates a distribution which we shall call $A_n$. For the sake of
simplicity, we shall use the symbol $A_n$ to denote both the sampling plan
and the distribution generated by this sampling plan. This distribution
has already been studied by Gabriel [8].

All the plans considered in this thesis can be derived from the
following model. Successive observations are made on chance variables
$X_1, X_2, X_3, \ldots$ such that $P(X_1 = 1) = p_1$,
$P(X_1 = 0) = 1 - p_1 = q_1$ and

$$
\begin{aligned}
P(X_n = 1 \mid X_{n-1} = 1) &= p_2, & P(X_n = 0 \mid X_{n-1} = 1) &= 1 - p_2 = q_2, \\
P(X_n = 1 \mid X_{n-1} = 0) &= p_1, & P(X_n = 0 \mid X_{n-1} = 0) &= 1 - p_1 = q_1,
\end{aligned}
\tag{1.2.1}
$$

where $0 < p_1 < 1$, $0 < p_2 < 1$ and $p_1 \neq p_2$. If $p_1 = p_2$,
we have the usual one parameter binomial sampling plan. We can also
consider this model as the following coin tossing problem. Two coins,
1 and 2, are tossed with probabilities $p_1$ and $p_2$, $0 < p_1 < 1$,
$0 < p_2 < 1$, $p_1 \neq p_2$, of falling heads. On the first toss,
coin 1 is used. On the $j$th toss, coin 1 is used if the $(j-1)$st toss
was a tail, and coin 2 is used if the $(j-1)$st toss was a head. By
specifying various stopping rules, we can obtain different distributions
(sampling plans). We shall consider several such stopping rules in the
next sections and in Chapter II.
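The coin tossing model can be made concrete with a short numerical sketch. The function names and the `first_coin` parameter below are illustrative assumptions, not notation from the thesis; the code simply multiplies the one-step probabilities of the model and, for the fixed plan, sums over all sequences of length n.

```python
from itertools import product


def sequence_probability(seq, p1, p2, first_coin=1):
    """Probability of a 0/1 sequence (1 = head) under the two-coin model.

    Coin 1 (head probability p1) is tossed after a tail, coin 2 (head
    probability p2) after a head. `first_coin` is a hypothetical parameter
    choosing the coin for the first toss (coin 1 in the basic model).
    """
    prob = 1.0
    coin = first_coin
    for x in seq:
        p = p1 if coin == 1 else p2
        prob *= p if x == 1 else 1.0 - p
        coin = 2 if x == 1 else 1  # a head switches to coin 2, a tail to coin 1
    return prob


def fixed_plan_distribution(n, p1, p2, first_coin=1):
    """P(r heads in n tosses), r = 0..n, by enumerating all 2**n sequences."""
    dist = [0.0] * (n + 1)
    for seq in product((0, 1), repeat=n):
        dist[sum(seq)] += sequence_probability(seq, p1, p2, first_coin)
    return dist
```

Calling `fixed_plan_distribution(n, p, p)` with equal head probabilities should reproduce the ordinary binomial distribution, which is a convenient sanity check on the model.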

Definition 1.2.1.  The distribution obtained by stopping as soon as n
tosses are made is called $A_n$.

By using coin 2 at the first toss and then proceeding as above,
we obtain the distribution which we call $A'_n$. It is easily seen that the
distributions $A_n$ and $A'_n$ are generated by the "fixed" sampling plan
of size n under the model 1.2.1. In De Groot's notation [6], the boundary,
B, for $A_n$ and $A'_n$ consists of all those points $\gamma = (x,y)$ where x
and y are non-negative integers such that x + y = n, n > 0.


In terms of the model 1.2.1, we can determine the probability
of reaching any point $\gamma = (x,y)$. In the usual way, we can associate the
sequence of 1's and 0's with a lattice path in the plane, where each
1 is represented by a unit horizontal step and each 0 by a unit vertical
step. Since we shall be dealing with such lattice paths only, we shall
simply call them paths. Let k be the number of right angle turns formed
by a horizontal step followed by a vertical step (k is the number of
(1,0) joins) in the path to the point $\gamma = (x,y)$. The probability
associated with this path is easily found to be

$$p_1^{k+1}\, q_1^{y-k}\, p_2^{x-k-1}\, q_2^{k} \quad \text{if the last step is horizontal (1),}$$

and

$$p_1^{k}\, q_1^{y-k}\, p_2^{x-k}\, q_2^{k} \quad \text{if the last step is vertical (0).}$$


The probability $p(\gamma)$ of reaching the point $\gamma = (x,y)$ is

$$
p(\gamma) = \sum_{k \in K} N(\gamma,k)\, p_1^{k+1} q_1^{y-k} p_2^{x-k-1} q_2^{k}
          + \sum_{k \in K'} N'(\gamma,k)\, p_1^{k} q_1^{y-k} p_2^{x-k} q_2^{k},
\tag{1.2.2}
$$


where $N(\gamma,k)$ and $N'(\gamma,k)$ represent the number of paths to the point
$\gamma = (x,y)$ with k (1,0) joins and ending in a 1 or a 0 respectively.
The index set K is a subset of the set of integers 0,1,2,...,min(x-1,y)
and K' is a subset of the integers 0,1,2,...,min(x,y). For any stopping rule
(sampling plan), we can completely determine the distribution by
determining K, K', $N(\gamma,k)$ and $N'(\gamma,k)$ for each $\gamma$ in the boundary. Notice
that equation 1.2.2 is the obvious two parameter analogue of the one
parameter case (cf. De Groot [6], p. 82).


1.3  Completeness and Estimation in the Distributions $A_n$, $A'_n$.

In this section, we shall use the method of generating functions
to prove the completeness of the distributions $A_n$ and $A'_n$. This method
has the advantage that it enables us to list the polynomials estimable
unbiasedly. Then, by using the standard Rao-Blackwell procedure, we
obtain the unique, unbiased, minimum variance estimator of $p_1$.

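Let us suppose we toss the coins n times using coin 1 on
the first toss ($A_n$). Let r be the number of heads obtained. Let $p_{n,r}$
be the probability of obtaining r heads in n tosses. Then, according
to 1.2.2,

$$
\begin{aligned}
p_{n,r} &= \sum_{k \in K} N(\gamma,k)\, p_1^{k+1} q_1^{n-r-k} p_2^{r-k-1} q_2^{k}
         + \sum_{k \in K'} N'(\gamma,k)\, p_1^{k} q_1^{n-r-k} p_2^{r-k} q_2^{k} \\
        &= \sum_{k=0}^{r-1} \binom{r-1}{k}\binom{n-r}{k}\, p_1^{k+1} q_1^{n-r-k} p_2^{r-k-1} q_2^{k}
         + \sum_{k=1}^{r} \binom{r-1}{k-1}\binom{n-r}{k}\, p_1^{k} q_1^{n-r-k} p_2^{r-k} q_2^{k}.
\end{aligned}
\tag{1.3.1}
$$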


The coefficients $N(\gamma,k)$ are obtained as follows. Since there are k (1,0)
joins, this implies that among the r-1 heads occurring before the last
head there will be k changes to tails. These changes may occur in
$\binom{r-1}{k}$ ways. Also, these k changes occur before tails, which may be
arranged in $\binom{n-r}{k}$ ways. The total number of ways in which k (1,0)
joins may appear is $N(\gamma,k) = \binom{r-1}{k}\binom{n-r}{k}$. A similar
argument yields $N'(\gamma,k)$.
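The path counts can be checked by brute force for small n. The sketch below is illustrative only: the function names are hypothetical, and the closed form used for paths ending in a tail, C(r-1, k-1) C(n-r, k), is an assumption obtained by the "similar argument" mentioned in the text, with the all-tails path (r = 0) treated as a separate case.

```python
from itertools import product
from math import comb


def count_paths(n, r, k, last):
    """Count 0/1 sequences of length n with r ones, k (1,0) joins, ending in `last`."""
    total = 0
    for seq in product((0, 1), repeat=n):
        if sum(seq) != r or seq[-1] != last:
            continue
        joins = sum(1 for a, b in zip(seq, seq[1:]) if (a, b) == (1, 0))
        if joins == k:
            total += 1
    return total


def n_head(n, r, k):
    """Closed form for paths ending in a head: C(r-1, k) * C(n-r, k)."""
    if r == 0:
        return 0  # a path with no heads cannot end in a head
    return comb(r - 1, k) * comb(n - r, k)


def n_tail(n, r, k):
    """Assumed closed form for paths ending in a tail: C(r-1, k-1) * C(n-r, k)."""
    if r == 0:
        return 1 if k == 0 else 0  # the single all-tails path has no (1,0) join
    if k == 0:
        return 0  # with at least one head, the last head is followed by a tail
    return comb(r - 1, k - 1) * comb(n - r, k)
```

For example, with n = 4 and r = 2 the sequences ending in a tail are 1100, 0110 (one join each) and 1010 (two joins), matching `n_tail(4, 2, 1) == 2` and `n_tail(4, 2, 2) == 1`.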

If we toss the coin n times using coin 2 at the first toss,
we can obtain a similar expression, $p'_{n,r}$, for the probability of
obtaining r heads in n tosses in the game $A'_n$, i.e.

$$
p'_{n,r} = \sum_{k=1}^{r} \binom{n-r-1}{k-1}\binom{r}{k}\, q_1^{n-r-k} p_1^{k} q_2^{k} p_2^{r-k}
         + \sum_{k=0}^{r} \binom{n-r-1}{k}\binom{r}{k}\, q_1^{n-r-k-1} p_1^{k} q_2^{k+1} p_2^{r-k}.
\tag{1.3.2}
$$

It is clear that by putting $p_1 = p_2$, we obtain the usual one parameter
binomial distribution.


In what follows, we shall use the ideas and notation developed
in Blackwell and Girshick [4], Chapters 3 and 8, to describe sample spaces,
sufficient partitions and sufficient statistics. Let
$\mathcal{Z} = (Z, \Omega, p_\omega)$, where Z consists of points z representing
outcomes in $A_n$,
$\Omega = \{(p_1,p_2) : 0 < p_1 < 1,\ 0 < p_2 < 1,\ p_1 \neq p_2\}$ and $p_\omega(z)$
is the probability of $z \in Z$ given $\omega \in \Omega$.


Let us consider the partition S of Z where

$$S = (s^1_{10}, \ldots, s^1_{rk}, \ldots, s^1_{n0},\ s^2_{00}, \ldots, s^2_{mk}, \ldots, s^2_{n-1,1}).$$

The superscripts 1 and 2 indicate whether the last step in the sequence z
was a head or tail and the subscripts indicate the number of heads and the
number of right angle turns respectively. It is easily seen that S is a
sufficient partition. Let Y = (R,k), where R = ± r according as the sequence
ends with a head or a tail. The number of successes is r = |R|. Thus,
Y is seen to be a sufficient statistic.

Lemma 1.3.1  Let $A_n(s)$, $A'_n(s)$ represent the generating functions for
the probability distributions $A_n$ and $A'_n$. Then

$$
\begin{aligned}
A_n(s)  &= q_1 A_{n-1}(s) + p_1 s\, A'_{n-1}(s), \\
A'_n(s) &= q_2 A_{n-1}(s) + p_2 s\, A'_{n-1}(s).
\end{aligned}
$$

Proof: We shall give the proof for $A_n(s)$ only, as the proof for $A'_n(s)$
is similar. The proof follows from the fact that after the first toss
the resulting game is $A_{n-1}$ or $A'_{n-1}$ depending on whether the first
toss resulted in a tail or a head.
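The recursion for the generating functions can be iterated numerically. The following is a minimal sketch under stated assumptions: polynomials in s are stored as coefficient lists, the function name is hypothetical, and the starting value $A_0(s) = A'_0(s) = 1$ (no tosses) is an assumption of the sketch rather than something stated in the lemma.

```python
def generating_functions(n, p1, p2):
    """Coefficient lists (in powers of s) of A_n(s) and A'_n(s).

    Uses A_n = q1*A_{n-1} + p1*s*A'_{n-1} and A'_n = q2*A_{n-1} + p2*s*A'_{n-1},
    starting from the assumed base case A_0 = A'_0 = 1.
    """
    q1, q2 = 1.0 - p1, 1.0 - p2
    a, a_prime = [1.0], [1.0]
    for _ in range(n):
        padded = a + [0.0]            # A_{n-1}(s), padded to the new length
        shifted = [0.0] + a_prime     # s * A'_{n-1}(s)
        new_a = [q1 * u + p1 * v for u, v in zip(padded, shifted)]
        new_a_prime = [q2 * u + p2 * v for u, v in zip(padded, shifted)]
        a, a_prime = new_a, new_a_prime
    return a, a_prime
```

The coefficient of $s^r$ in the first list is the probability of r heads in n tosses when coin 1 is used first; in the second list, when coin 2 is used first.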


Lemma 1.3.2  The number of distinct values taken on by the sufficient
statistic Y (with positive probability) for $A_n$ is $\frac{n(n+1)}{2} + 1$.

Proof: Suppose n is odd. We count the values of Y (taken on with
positive probability) as follows.

1. Let R > 0. Then each sequence ends with a head and for a
given R the possible values of k range from 0 to min(R - 1, n - R).

2. R = 0. The only value of k is 0.

3. R < 0. Each sequence ends with a tail and k cannot be
zero. For a given R, k ranges from 1 to min(|R|, n - |R|). Thus
the number of values taken on by Y is

$$
\left(1 + 2 + \cdots + \tfrac{n+1}{2} + \tfrac{n-1}{2} + \cdots + 2 + 1\right) + (1)
+ \left(1 + 2 + \cdots + \tfrac{n-1}{2} + \tfrac{n-1}{2} + \tfrac{n-3}{2} + \cdots + 1\right)
= \frac{n(n+1)}{2} + 1,
$$

where, for example, the first bracket indicates the distinct values of Y
in case (1). A similar counting procedure yields the same result if n
is even.
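The count in the lemma is easy to confirm by enumeration for small n. This is only an illustrative check with hypothetical function names; it encodes Y = (R, k) exactly as defined above and counts the distinct values over all sequences of length n.

```python
from itertools import product


def sufficient_statistic(seq):
    """Y = (R, k): R = +r if the sequence ends in a head, -r if it ends in a tail."""
    r = sum(seq)
    k = sum(1 for a, b in zip(seq, seq[1:]) if (a, b) == (1, 0))
    return (r if seq[-1] == 1 else -r, k)


def count_values(n):
    """Number of distinct values of Y over all 0/1 sequences of length n >= 1."""
    return len({sufficient_statistic(seq) for seq in product((0, 1), repeat=n)})
```

Since every sequence has positive probability when $0 < p_1, p_2 < 1$, the set built here contains exactly the values taken on with positive probability.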


Theorem 1.3.1  The distributions $A_n$, $A'_n$ are complete.

Proof: Clearly the distributions $A_1$, $A'_1$ are complete. We shall follow
the idea expressed in [6], p. 97, to show $A_n$ is complete. We shall prove
by induction that any maximal set of linearly independent polynomials in
$p_1$ and $p_2$ which are estimable unbiasedly contains $\frac{n(n+1)}{2} + 1$
elements, thus showing that $A_n$ is complete. For $A_4$, such a set is

$$
\begin{array}{lllll}
1 & p_1 & p_1^2 & p_1^3 & p_1^4 \\
p_1p_2 & p_1p_2^2 & p_1p_2^3 & & \\
p_1^2p_2 & p_1^2p_2^2 & & & \\
p_1^3p_2 & & & &
\end{array}
$$

It is clear that such a set is not unique. There are many other sets of
linearly independent polynomials spanning the space of estimable
polynomials. Another such equivalent set for $A_4$ is

$$
\begin{array}{lllll}
1 & p_1 & p_1^2 & p_1^3 & p_1^4 \\
p_1q_2 & p_1q_2^2 & p_1q_2^3 & & \\
p_1^2q_2 & p_1^2q_2^2 & & & \\
p_1^3q_2 & & & &
\end{array}
$$
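We shall now assume

1. that the distributions $A_n$, $A'_n$ are complete for $n \le k$,

2. that the basis for the linear space of estimable polynomials
in $A_k$, $A'_k$ can be taken to be

$$
A_k: \quad
\begin{array}{llll}
1 & p_1 & \cdots & p_1^k \\
p_1p_2 & p_1p_2^2 & \cdots & p_1p_2^{k-1} \\
p_1^2p_2 & p_1^2p_2^2 & \cdots & p_1^2p_2^{k-2} \\
\vdots & & & \\
p_1^{k-2}p_2 & p_1^{k-2}p_2^2 & & \\
p_1^{k-1}p_2 & & &
\end{array}
\tag{1.3.3}
$$

$$
A'_k: \quad
\begin{array}{llll}
1 & p_2 & \cdots & p_2^k \\
p_1q_2 & p_1q_2^2 & \cdots & p_1q_2^{k-1} \\
p_1^2q_2 & p_1^2q_2^2 & \cdots & p_1^2q_2^{k-2} \\
\vdots & & & \\
p_1^{k-2}q_2 & p_1^{k-2}q_2^2 & & \\
p_1^{k-1}q_2 & & &
\end{array}
\tag{1.3.4}
$$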


It is easily seen that the polynomials in these sets are linearly independent.
Now apply the generating function relationship of Lemma 1.3.1, i.e.
multiply 1.3.3 by $q_1$ and 1.3.4 by $p_1$. The resulting set contains all
the polynomials estimable unbiasedly in $A_{k+1}$. From this set we wish to
select the maximal linearly independent set of polynomials. We can obtain
this maximal set as follows. $q_1$ appears when we multiply 1.3.3 by $q_1$
and $p_1$ appears when we multiply 1.3.4 by $p_1$. Since $q_1 + p_1 = 1$, we
can replace $q_1$ by 1. $p_1p_2, p_1p_2^2, \ldots, p_1p_2^k$ are obtained from 1.3.4
on multiplication by $p_1$ and remain unchanged. Since
$p_1^{j-1} - q_1p_1^{j-1} = p_1^{j}$, we replace $q_1p_1^{j-1}$, j = 2,3,...,k+1, by $p_1^{j}$.
Similarly, we replace $q_1p_1^{j-1}p_2^{i}$ by $p_1^{j}p_2^{i}$, j = 3,4,...,k,
i = k-2, k-3, ..., 1. Since

$$p_1^{j} q_2^{i} = p_1^{j}\left(1 - \binom{i}{1}p_2 + \binom{i}{2}p_2^2 - \cdots + (-1)^{i} p_2^{i}\right),$$

we can remove all the remaining polynomials to obtain the maximal linearly
independent set 1.3.3 with n = k + 1. Since each set contains
$\frac{(k+1)(k+2)}{2} + 1$ elements, the proof is complete.

As one would intuitively expect, $p_2$ is not estimable in $A_n$,
since it is conceivable that all observations result in a tail and coin 2
is not used at all.


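We can obtain the unique, minimum variance unbiased estimator
$\hat{p}_1$ of $p_1$ as follows. Let

    X = 1 if the first toss is a head,
      = 0 otherwise.

By the Rao-Blackwell theorem,

$$
\hat{p}_1 = E(X \mid R,k) =
\begin{cases}
\dfrac{N^{*}(\gamma,k)}{N(\gamma,k)}, & k \in K, \\[2ex]
\dfrac{N^{*\prime}(\gamma,k)}{N'(\gamma,k)}, & k \in K',
\end{cases}
$$

where $N^{*}(\gamma,k)$ and $N^{*\prime}(\gamma,k)$ denote the number of paths in each
of the two cases beginning with a head. For $k \in K$,
$N^{*}(\gamma,k) = \binom{r-1}{k}\binom{n-r}{k} - \binom{r-1}{k}\binom{n-r-1}{k}$. For
$k \in K'$,
$N^{*\prime}(\gamma,k) = \binom{n-r}{k}\binom{r-1}{k-1} - \binom{n-r-1}{k}\binom{r-1}{k-1}$. Hence

$$
\hat{p}_1 =
\begin{cases}
\dfrac{\binom{n-r}{k} - \binom{n-r-1}{k}}{\binom{n-r}{k}} = \dfrac{k}{n-r}, & r < n, \\[2ex]
1, & r = n,
\end{cases}
$$

in both cases.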
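A direct numerical check of unbiasedness is possible for small n. The sketch below is illustrative: the function names are hypothetical, and it assumes the estimator takes the form k/(n - r) for r < n and 1 for r = n, where k is the number of (1,0) joins. It then averages the estimator over all sequences under the two-coin model with coin 1 used first.

```python
from itertools import product


def sequence_probability(seq, p1, p2):
    """Probability of a 0/1 sequence when coin 1 is tossed first (game A_n)."""
    prob, coin = 1.0, 1
    for x in seq:
        p = p1 if coin == 1 else p2
        prob *= p if x == 1 else 1.0 - p
        coin = 2 if x == 1 else 1
    return prob


def estimate_p1(seq):
    """Assumed form of the estimator: k / (n - r) if r < n, and 1 if r = n."""
    n, r = len(seq), sum(seq)
    if r == n:
        return 1.0
    k = sum(1 for a, b in zip(seq, seq[1:]) if (a, b) == (1, 0))
    return k / (n - r)


def expected_estimate(n, p1, p2):
    """E[estimate] by summing over all 2**n sequences."""
    return sum(sequence_probability(s, p1, p2) * estimate_p1(s)
               for s in product((0, 1), repeat=n))
```

For n = 2 the only sequences with a non-zero estimate are HH and HT, whose probabilities $p_1p_2$ and $p_1q_2$ sum to $p_1$, in agreement with the check.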
1.4  Completeness and Estimation in $A_{mn}$, $A'_{mn}$.

In  this  section,  we  shall  consider  the  distributions  obtained  by 
varying  the  stopping  rule  in  the  model  described  by  1.2.1. 


Definition 1.4.1  Let us consider the model 1.2.1. If we stop whenever
m ones (heads) or n zeroes (tails) appear, the resulting distribution
is called $A_{mn}$. If initially $P(X_1 = 1) = p_2$, then the resulting
distribution is called $A'_{mn}$. Thus, the sampling plan which yields the
distributions $A_{mn}$, $A'_{mn}$ has boundary points (m,0), (m,1), ..., (m,n-1),
(0,n), (1,n), ..., (m-1,n). Let k be the number of (1,0) joins. According
to 1.2.2, the probability of reaching a boundary point (m,y) is given by

$$
\sum_{k=0}^{\min(m-1,\,y)} \binom{y}{k}\binom{m-1}{k}\, p_1^{k+1} q_1^{y-k} p_2^{m-k-1} q_2^{k}
\tag{1.4.1}
$$

and that of reaching a boundary point (x,n) is given by

$$
\sum_{k=1}^{\min(x,\,n)} \binom{n}{k}\binom{x-1}{k-1}\, p_1^{k} q_1^{n-k} p_2^{x-k} q_2^{k}.
\tag{1.4.2}
$$

If we let k be the number of (0,1) joins in $A'_{mn}$, the probability
of reaching a boundary point (m,y), $y \ge 1$, is

$$
\sum_{k=1}^{\min(y,\,m)} \binom{y-1}{k-1}\binom{m}{k}\, p_1^{k} q_1^{y-k} p_2^{m-k} q_2^{k}
\tag{1.4.3}
$$

and the probability of reaching the point (x,n) is

$$
\sum_{k=0}^{\min(x,\,n-1)} \binom{n-1}{k}\binom{x}{k}\, p_1^{k} q_1^{n-k-1} p_2^{x-k} q_2^{k+1}.
\tag{1.4.4}
$$
V 
We shall use the method of generating functions developed in
1.3 to prove that the distributions $A_{mn}$, $A'_{mn}$ are complete. Although
this method cannot readily be extended to distributions determined by
more general stopping rules, it does provide some insight into the nature
of these distributions. As in 1.3, we can define a sufficient partition
and a sufficient statistic T = (N,k), where N = x or N = -y depending
on whether the path is to the boundary point (x,n) or (m,y). The
following lemmas determine the number of values T takes on with
positive probability and the generating function relationship between the
distributions.

Lemma 1.4.1  The sufficient statistic T takes on mn + 1 values with
positive probability.

Proof: Let $m \ge n$. Since k ranges from 0 to min(m-1, y) and from
1 to min(x, n), the total number of values is
1 + (1 + 2 + 3 + ... + n) + (m - n - 1)n + (1 + 2 + ... + n) = mn + 1.

If m < n, a similar counting procedure yields
1 + (1 + 2 + ... + (m-1)) + (1 + 2 + ... + (m-1)) + (n - m + 1)m = mn + 1.
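The count mn + 1 can be confirmed by enumerating the stopped sequences for small m and n. The sketch below uses hypothetical function names. One assumption is made explicit: the statistic carries a flag for the side of the boundary that was reached, so that the end points (m,0) and (0,n), which would both give N = 0 and k = 0, are kept apart as they are in the count of the lemma.

```python
def stopped_paths(m, n):
    """Yield every 0/1 sequence that stops as soon as m ones or n zeros appear."""
    stack = [()]
    while stack:
        seq = stack.pop()
        ones = sum(seq)
        zeros = len(seq) - ones
        if ones == m or zeros == n:
            yield seq
        else:
            stack.append(seq + (0,))
            stack.append(seq + (1,))


def statistic(seq, m, n):
    """(side, coordinate, k) for a stopped sequence; k counts (1,0) joins.

    side is "tails" for a boundary point (x, n) and "heads" for (m, y); the
    explicit flag is an assumption of this sketch (see the lead-in).
    """
    ones = sum(seq)
    zeros = len(seq) - ones
    k = sum(1 for a, b in zip(seq, seq[1:]) if (a, b) == (1, 0))
    if zeros == n:
        return ("tails", ones, k)
    return ("heads", zeros, k)


def count_values(m, n):
    """Number of distinct values of the statistic over all stopped paths."""
    return len({statistic(seq, m, n) for seq in stopped_paths(m, n)})
```

For m = 2, n = 2 the stopped sequences are HH, HTH, THH, HTT, THT and TT, which give five distinct values, in agreement with mn + 1 = 5.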


Lemma 1.4.2  Let $A_{mn}(s)$ and $A'_{mn}(s)$ represent the generating functions
for the probability distributions of $A_{mn}$ and $A'_{mn}$. Then

$$
\begin{aligned}
A_{mn}(s)  &= q_1 A_{m,n-1}(s) + p_1 s\, A'_{m-1,n}(s), \\
A'_{mn}(s) &= q_2 A_{m,n-1}(s) + p_2 s\, A'_{m-1,n}(s).
\end{aligned}
$$

Proof: The proof follows from the fact that after the first trial, the
remaining trials form a distribution for $A'_{m-1,n}$ and $A_{m,n-1}$
respectively.

The proof of Theorem 1.4.1 parallels that of Theorem 1.3.1,
just as the two lemmas proven above are analogues of Lemmas 1.3.1 and 1.3.2.


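Theorem 1.4.1.  The polynomials estimable unbiasedly in $A_{mn}$ and $A'_{mn}$
are linear combinations of the following sets of polynomials.

$$
A_{mn}: \quad
\begin{array}{llll}
1 & p_1 & \cdots & p_1^{n} \\
p_1p_2 & p_1^2p_2 & \cdots & p_1^{n}p_2 \\
p_1p_2^2 & p_1^2p_2^2 & \cdots & p_1^{n}p_2^2 \\
\vdots & & & \\
p_1p_2^{m-1} & p_1^2p_2^{m-1} & \cdots & p_1^{n}p_2^{m-1}
\end{array}
\tag{1.4.5}
$$

$$
A'_{mn}: \quad
\begin{array}{llll}
1 & q_2 & \cdots & q_2^{m} \\
q_2q_1 & q_2^2q_1 & \cdots & q_2^{m}q_1 \\
q_2q_1^2 & q_2^2q_1^2 & \cdots & q_2^{m}q_1^2 \\
\vdots & & & \\
q_2q_1^{n-1} & q_2^2q_1^{n-1} & \cdots & q_2^{m}q_1^{n-1}
\end{array}
\tag{1.4.6}
$$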

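Proof: First we consider the polynomials for the games $A_{1n}$, $A_{m1}$,
$A'_{1n}$, $A'_{m1}$. For these games, the polynomials are:

(a)  $p_1,\ q_1p_1,\ q_1^2p_1,\ \ldots,\ q_1^{n-1}p_1,\ q_1^{n}$

(b)  $q_1,\ p_1q_2,\ p_1p_2q_2,\ \ldots,\ p_1p_2^{m-2}q_2,\ p_1p_2^{m-1}$

(c)  $p_2,\ q_2p_1,\ q_2q_1p_1,\ \ldots,\ q_2q_1^{n-2}p_1,\ q_2q_1^{n-1}$

(d)  $q_2,\ p_2q_2,\ p_2^2q_2,\ \ldots,\ p_2^{m-1}q_2,\ p_2^{m}$

These polynomials can be replaced by the equivalent sets

(a')  $1,\ p_1,\ p_1^2,\ \ldots,\ p_1^{n}$  or  $1,\ q_1,\ q_1^2,\ \ldots,\ q_1^{n}$

(b')  $1,\ p_1,\ p_1p_2,\ \ldots,\ p_1p_2^{m-2},\ p_1p_2^{m-1}$

(c')  $1,\ q_2,\ q_2q_1,\ \ldots,\ q_2q_1^{n-2},\ q_2q_1^{n-1}$

(d')  $1,\ p_2,\ p_2^2,\ \ldots,\ p_2^{m}$  or  $1,\ q_2,\ q_2^2,\ \ldots,\ q_2^{m}$

Thus, the theorem is true for $A_{m1}$, $A_{1n}$, $A'_{m1}$, $A'_{1n}$.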

Proceeding by induction and using the generating function
relationship, we obtain the following sets of polynomials.

$$
\text{(a)} \quad
\begin{array}{llll}
q_1 & q_1p_1 & \cdots & q_1p_1^{n-1} \\
q_1p_1p_2 & q_1p_1^2p_2 & \cdots & q_1p_1^{n-1}p_2 \\
\vdots & & & \\
q_1p_1p_2^{m-1} & q_1p_1^2p_2^{m-1} & \cdots & q_1p_1^{n-1}p_2^{m-1}
\end{array}
$$

and

$$
\text{(b)} \quad
\begin{array}{llll}
p_1 & p_1q_2 & \cdots & p_1q_2^{m-1} \\
p_1q_2q_1 & p_1q_2^2q_1 & \cdots & p_1q_2^{m-1}q_1 \\
\vdots & & & \\
p_1q_2q_1^{n-1} & p_1q_2^2q_1^{n-1} & \cdots & p_1q_2^{m-1}q_1^{n-1}
\end{array}
$$

Combine $p_1$ and $q_1$ and replace $q_1$ by 1. $p_1^{j}$, j = 2,...,n, is
obtained by combining $p_1^{j-1}$ and $q_1p_1^{j-1}$ and replacing $q_1p_1^{j-1}$ by
$p_1^{j}$. $p_1^{j}p_2^{i}$, j = 1,2,...,n, i = 1,2,...,m-1, is obtained from
$p_1^{j-1}p_2^{i}$ and $q_1p_1^{j-1}p_2^{i}$ and replacing $q_1p_1^{j-1}p_2^{i}$ by $p_1^{j}p_2^{i}$.
The remaining polynomials $p_1q_2^{j}q_1^{i}$, j = 1,2,...,m-1, i = 1,2,...,n-1,
are easily seen to be linear combinations of the polynomials already
obtained and can be omitted. Thus, the maximal set of linearly independent
polynomials is given by 1.4.5. By multiplying by $q_2$ and $p_2$ and going
through a similar procedure as above, we obtain the maximal set 1.4.6.


Corollary.  The sufficient statistic T is complete.

Proof: The polynomials 1.4.5 and 1.4.6 span the space of estimable
polynomials and are linearly independent. The polynomials given by
1.4.1, 1.4.2 and 1.4.3, 1.4.4 also span the space of estimable polynomials.
Since the number of these polynomials is, by Lemma 1.4.1, mn + 1, they
must be linearly independent.


We conclude this section by pointing out that the results of
1.3 and 1.4 are a generalization of the one parameter case. We have
already seen that putting $p_1 = p_2$ in 1.3.1 and 1.3.2 yields the one
parameter binomial distribution. Since

$$\sum_{k=0}^{\min(x,z)} \binom{x}{k}\binom{z}{k} = \binom{x+z}{x},$$

a similar one parameter distribution is obtained by putting $p_1 = p_2$ in
1.4.1 and 1.4.2. Also, De Groot [6] has proved that in a sampling plan
of size n, all polynomials of degree at most n are estimable unbiasedly.
But it is easily seen that putting $p_1 = p_2$ in 1.3.3 and 1.3.4 gives us
the estimable polynomials of degree at most n. Similarly, if we put
$p_1 = p_2$ in 1.4.5 and 1.4.6, we obtain all the estimable polynomials
of degree at most m + n - 1. Since the size of $A_n$ ($A'_n$) is n, and
the size of $A_{mn}$ ($A'_{mn}$) is m + n - 1, we have obtained a direct
generalization of De Groot's result for these cases.

1.5  Simple Sampling Plans and Vectors with Non-Negative Integral Components.


Simple  sampling  plans  were  first  studied  by  Girshick,  Mosteller 
and  Savage  [9]  in  connection  with  the  estimation  of  the  parameter  in  the 


binomial  distribution.  They  defined  a  sampling  plan  to  be  simple  if  no 
two  continuation  points  on  the  line  x  +  y  =  j,  j  =  0,1,2, ...  are 
separated  by  boundary  or  inaccessible  points.  Formally,  we  state  this 
definition  as  follows. 


Definition 1.5.1  Let $m_j = \sum_{i=1}^{j} X_i$. A sampling plan is said to be
simple if, and only if, for any two continuation points $(m_j, j - m_j)$ and
$(m'_j, j - m'_j)$, the points $(m_j + 1, j - m_j - 1)$, $(m_j + 2, j - m_j - 2)$,
..., $(m'_j - 1, j - m'_j + 1)$ are also continuation points.

Since $m_j$ is the total number of ones in the first j terms
of the sequence $X_1, X_2, \ldots, X_j, \ldots$, the points $(m_j, j - m_j)$ and
$(m'_j, j - m'_j)$ lie on the line x + y = j. Thus our Definition 1.5.1 is
clearly equivalent to the above definition, which states that all the
points between any 2 continuation points lying on the line x + y = j
are also continuation points.
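The definition of simplicity translates directly into a small check. The sketch below is illustrative and rests on stated assumptions: a bounded plan is described by its set of boundary points and its size, continuation points are taken to be the non-boundary points reachable from (0,0), and the function names are hypothetical.

```python
def continuation_points(boundary, size):
    """Non-boundary points reachable from (0, 0) on the lines x + y = 0..size-1.

    Assumes a bounded plan of the given size whose boundary is listed in full.
    """
    boundary = set(boundary)
    cont, frontier = set(), {(0, 0)}
    for _ in range(size):
        cont |= frontier
        frontier = {pt for (x, y) in frontier
                    for pt in ((x + 1, y), (x, y + 1)) if pt not in boundary}
    return cont


def is_simple(boundary, size):
    """True if, on every line x + y = j, the continuation points are contiguous."""
    cont = continuation_points(boundary, size)
    for j in range(size):
        xs = sorted(x for (x, y) in cont if x + y == j)
        if xs and xs[-1] - xs[0] + 1 != len(xs):
            return False  # a boundary or inaccessible point separates two of them
    return True
```

As an example, the fixed plan of size 3 with boundary (0,3), (1,2), (2,1), (3,0) passes the check, whereas the plan with boundary (1,1), (3,0), (2,1), (1,2), (0,3) fails it, because the continuation points (2,0) and (0,2) are separated by the boundary point (1,1).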


An alternative and perhaps more intuitive way of characterizing
the simplicity of a sampling plan of size n is by means of the number of
boundary points. De Groot [6] first proved the interesting result that a
sampling plan of size n contains at least n + 1 boundary points. From
this result and one of Girshick, Mosteller and Savage [9], it follows that
a simple sampling plan of size n has exactly n + 1 boundary points.
This result was obtained in a direct way by Brainerd and Narayana [5].
Mohanty and Narayana [13] used this result to obtain a characterization
of simple sampling plans of size n as vectors with n + 1 non-negative,
integral components satisfying certain conditions. Since we shall use
this characterization and some of the results arising from it in later
chapters, these results are summarized in this section.


Definition 1.5.2  A simple sampling plan is characterized as a vector
$a = (a_1, a_2, \ldots, a_{n+1})$, the $a_i$'s being non-negative integers
satisfying the following conditions:

(a) There exists an integer i, $1 \le i \le n$, such that $a_i = a_{i+1} = 0$,
i.e. at least 2 consecutive a's in the vector are zero.

(b) Let k be the smallest integer i such that $a_i = a_{i+1} = 0$.
Then $a_1 \ge a_2 \ge a_3 \ge \cdots \ge a_{k-1} \ge 0$ and
$0 \le a_{k+2} \le a_{k+3} \le \cdots \le a_{n+1}$.

(c) Let B be the set of vectors $(b_1, b_2, \ldots, b_{k-1})$ where
$b_i = a_i$ (i = 1,2,...,k-1) and let C be the set of vectors
$(c_1, c_2, \ldots, c_{n-k})$ where $c_\ell = a_{n+2-\ell}$, $\ell$ = 1,2,...,n-k. The b's
stand for the a's with indices less than or equal to k-1 and the
c's for a's with indices greater than or equal to k+2. Then
$b_j \le n - j$ and $c_\ell \le n - \ell$. Further, if $b_j = n - j - r$
(r = 0,1,2,...,n-j-1), then

$$
c_p \le
\begin{cases}
n - p & \text{for } p = 1,2,\ldots,r, \\
n - p - j & \text{for } p > r.
\end{cases}
$$

Similarly, if $c_\ell = n - \ell - r$ (r = 0,1,...,n-$\ell$), then

$$
b_p \le
\begin{cases}
n - p & \text{for } p = 1,2,\ldots,r, \\
n - p - \ell & \text{for } p > r.
\end{cases}
$$


It was shown in [13] that to each vector a there corresponds
a simple sampling plan of size n and conversely. Thus, a simple sampling
plan of size n can be represented by a vector of n + 1 non-negative,
integral components, each component representing the amount of
"displacement" of the boundary point from the line x + y = n.

Another way of representing a simple sampling plan as a vector
was also developed in [13]. It was shown that a simple sampling plan
of size n can be represented as a vector $A_{n-1} = (a_1, a_2, \ldots, a_{n-1})$,
the $a_i$'s being non-negative integers satisfying the following conditions.

$$
\begin{aligned}
&(1) \quad a_1 \le a_2 \le \cdots \le a_{n-1}, \\
&(2) \quad a_i \le 2i, \qquad i = 1,2,\ldots,n-1.
\end{aligned}
\tag{1.5.3}
$$

In both the characterizations 1.5.2 and 1.5.3 of simple sampling plans of
size n, a key role is played by the relation of domination of one sampling
plan by another. The concept of domination of vectors with non-negative
integral components was first introduced in [15] and has been used in
various ways to study many combinatorial problems involving lattice
paths. We shall exploit the idea of domination in Chapter 2 to obtain the
relationship between the numbers of estimable polynomials and domination
with reference to certain sampling plans.

We consider vectors $A_n = (a_1, a_2, \ldots, a_n)$ whose components
are non-negative integers satisfying 1.5.3.


Definition  1.5.4  The  vector  A  =  (a,,  a^,  ...,  a  )  dominates  the 
- -  n  i  d  n 

vector  =  (b^,  b  ,  ...»  b^)  (we  write  A^  d  B^)  if,  and  only  if 

>  b  „  for  i  =  l,2,...,n.  We  shall  say  that  one  sampling  plan  of 
size  n  dominates  another  sampling  plan  of  size  n,  if  the  corresponding 
vectors  dominate  each  other  according  to  Definition  1.5.4. 
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Clearly  the  relationship  of  domination  is  a  partial  order  and 
it  can,  in  fact,  be  shown  that  under  this  relationship  the  set  of  vectors 
A  ,  or  equivalently  the  set  of  all  simple  sampling  plans  of  size  n, 
form  a  distributive  lattice.  We  shall  have  occasion  to  refer  to  this 
idea  in  Chapter  2. 
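As a small illustration of Definition 1.5.4, the following sketch checks the lattice behaviour of domination on a toy family of vectors, using componentwise maxima and minima as join and meet. The function names and the toy family (non-decreasing vectors with entries bounded by a hypothetical `cap`) are assumptions made only for illustration; they are not the conditions 1.5.3 themselves and are not taken from [15].

```python
from itertools import product

def dominates(a, b):
    """True if a dominates b in the sense of Definition 1.5.4 (a_i >= b_i for every i)."""
    return len(a) == len(b) and all(x >= y for x, y in zip(a, b))

def join(a, b):
    """Componentwise maximum: the least vector dominating both a and b."""
    return tuple(max(x, y) for x, y in zip(a, b))

def meet(a, b):
    """Componentwise minimum: the greatest vector dominated by both a and b."""
    return tuple(min(x, y) for x, y in zip(a, b))

def toy_family(length, cap):
    """Hypothetical family: non-decreasing vectors of integers between 0 and cap."""
    return [v for v in product(range(cap + 1), repeat=length)
            if all(v[i] <= v[i + 1] for i in range(length - 1))]

family = toy_family(3, 2)
family_set = set(family)

# The family is closed under join and meet, and the two operations distribute.
closed = all(join(a, b) in family_set and meet(a, b) in family_set
             for a in family for b in family)
distributive = all(meet(a, join(b, c)) == join(meet(a, b), meet(a, c))
                   for a in family for b in family for c in family)
```

Because join and meet are taken component by component, distributivity is inherited from the integers; the only thing that has to be checked for a particular family is closure.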

1.6 Deformations of Simple Sampling Plans.

This section is a summary of the results on deformations of sampling plans introduced in [5] to prove the equivalence of the definition of simple sampling plans given by Girshick, Mosteller and Savage [9] and that given by Definition 1.5.2. Since these deformations also play an important role in determining the number of estimable polynomials in a sampling plan, we shall give a brief summary of their properties and define, rather precisely, a special type of deformation called a canonical deformation.

The concept of deformation was defined in [5] for all sampling plans with essential boundaries. Since we are only concerned with the special case of simple sampling plans, we specialize the definition of deformation given in [5] to this case.

Definition 1.6.1 Let S be a bounded simple sampling plan. Let $\gamma_0 = (x_0, y_0)$ be a boundary point of S. Consider the new sampling plan S' obtained from S as follows:

(i) the boundary of S' consists of the boundary of S except that $\gamma_0$ becomes a continuation point.

(ii) to the boundary points of S' obtained in (i) add one or both of the points $\gamma_1 = (x_0, y_0 + 1)$, $\gamma_2 = (x_0 + 1, y_0)$ according as one or both of the points $\gamma_1$, $\gamma_2$ were inaccessible in S.

Then S' is a deformation of S at the boundary point $\gamma_0$. The sampling plan S' is obtained from S by changing one of the boundary points to a continuation point and adding to the boundary one or both of the nearest inaccessible points.


As an illustration of the three possibilities that can occur in Definition 1.6.1, we consider the sampling plan S of size 6, given in the notation of 1.5.2 as (0,0,0,0,1,3,3) (cf. fig. 1). For a deformation at the boundary point $\gamma_0 = (2,1)$ (marked CD in fig. 1), replace $\gamma_0$ by the point (3,1), since the point (2,2) is a continuation point in S. For a deformation at the boundary point $\gamma_0 = (3,2)$ (marked X in fig. 1), replace $\gamma_0$ by the point (4,2), since (3,3) is a boundary point in S. For a deformation at the point $\gamma_0 = (3,3)$, we would replace $\gamma_0$ by the points (3,4) and (4,3), since both of these points are inaccessible in S.
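The three cases of Definition 1.6.1 can be sketched in a few lines. The code below is only an illustration under stated assumptions: a plan is stored as a set of boundary points, a step to the right or upward moves to (x + 1, y) or (x, y + 1), and the boundary set `S` is one reading of the vector (0,0,0,0,1,3,3) that is consistent with the boundary points named in the example (the vector notation of 1.5.2 itself is not reproduced here). The function names are hypothetical.

```python
from collections import deque

def classify(boundary):
    """Split the accessible points of a bounded plan into continuation and boundary points."""
    boundary = set(boundary)
    max_index = max(x + y for x, y in boundary)
    continuation, reached = set(), set()
    queue, seen = deque([(0, 0)]), {(0, 0)}
    while queue:
        point = queue.popleft()
        if point in boundary:
            reached.add(point)          # sampling stops here
            continue
        x, y = point
        if x + y >= max_index:
            raise ValueError("some path escapes the boundary; the plan is not closed")
        continuation.add(point)
        for nxt in ((x + 1, y), (x, y + 1)):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return continuation, reached

def deform(boundary, gamma0):
    """Deformation at the boundary point gamma0, following steps (i) and (ii) of Definition 1.6.1."""
    continuation, reached = classify(boundary)
    if gamma0 not in reached:
        raise ValueError("gamma0 must be an accessible boundary point")
    x0, y0 = gamma0
    new_boundary = set(boundary) - {gamma0}           # (i) gamma0 becomes a continuation point
    for neighbour in ((x0, y0 + 1), (x0 + 1, y0)):    # (ii) add neighbours that were inaccessible
        if neighbour not in continuation and neighbour not in reached:
            new_boundary.add(neighbour)
    return new_boundary

# Assumed boundary of the size-6 example plan (an inference, see the note above).
S = {(0, 6), (1, 5), (2, 4), (3, 3), (3, 2), (2, 1), (3, 0)}
```

With this reading, `deform(S, (2, 1))` adds only (3,1), `deform(S, (3, 2))` adds only (4,2), and `deform(S, (3, 3))` adds both (3,4) and (4,3), in agreement with the three cases described in the text.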


In considering the above example, it is seen that if we deform at $\gamma_0 = (3,0)$, the resulting plan S' would be non-simple. Since we wish to avoid this situation, we make the following definition.

Definition 1.6.2 A deformation of S into S' is said to be admissible if S' is simple. In what follows, we shall only consider admissible deformations.

The following theorem, proved in [5], gives the basic result for our work.

Theorem 1.6.1. Let S be any simple sampling plan of size n. Then there exists a sequence of sampling plans $S = S_0, S_1, \ldots, S_k$ such that $S_{i+1}$ is an admissible deformation of $S_i$ (i = 0,1,2,...,k-1) and $S_k$ has exactly the points of index n as boundary points.


There  exists  a  certain  sequence  of  admissible  deformations  of 
any  simple  sampling  plan  of  size  n  which  is  of  special  interest,  since 
it  enables  us  to  count  the  number  of  right  angle  turns  taken  by  any  path 
to  any  boundary  point  at  each  stage  of  the  deformation.  This  sequence, 
which  we  call  the  canonical  sequence,  enables  us  to  determine  the  number 
of  estimable  polynomials  in  the  basis  of  the  linear  space  of  estimable 
polynomials  for  a  large  class  of  sampling  plans. 


Definition 1.6.3 Let S = (0, ..., 0) be the "fixed" sampling plan of size n. Let $S' = (a_1, a_2, \ldots, a_{k-1}, 0, \ldots, 0, a_{j+1}, a_{j+2}, \ldots, a_{n+1})$ be any other simple sampling plan of size n, where the $a_i$'s satisfy the conditions of Definition 1.5.2. The sequence of admissible deformations of S' into S given below is called a canonical sequence of deformations.

$$
\begin{aligned}
S' = S_0 &= (a_1, \ldots, a_{k-1}, 0, \ldots, 0, a_{j+1}, a_{j+2}, \ldots, a_{n+1}) \\
S_1 &= (a_1, \ldots, a_{k-1}, 0, \ldots, 0, a_{j+1} - 1, a_{j+2}, \ldots, a_{n+1}) \\
&\;\;\vdots \\
S_{\sum_{j+1}^{n} a_i} &= (a_1, \ldots, a_{k-1}, 0, \ldots, 0, a_{n+1}) \\
S_{1 + \sum_{j+1}^{n} a_i} &= (a_1, \ldots, a_{k-1} - 1, 0, \ldots, 0, a_{n+1}) \\
&\;\;\vdots \\
S_{a_{k-1} + \sum_{j+1}^{n} a_i} &= (a_1, \ldots, a_{k-2}, 0, \ldots, 0, a_{n+1}) \\
&\;\;\vdots \\
S_{\sum_{2}^{k-1} a_i + \sum_{j+1}^{n} a_i} &= (a_1, 0, \ldots, 0, a_{n+1}) \\
S_{\sum_{2}^{k-1} a_i + \sum_{j+1}^{n} a_i + 1} &= (a_1, 0, \ldots, 0, a_{n+1} - 1) \\
&\;\;\vdots \\
S_{\sum_{2}^{k-1} a_i + \sum_{j+1}^{n+1} a_i} &= (a_1, 0, \ldots, 0, 0) \\
&\;\;\vdots \\
S_{\sum_{2}^{k-1} a_i + \sum_{j+1}^{n+1} a_i + a_1 - 1} &= (1, 0, \ldots, 0, 0) \\
S = S_{\sum_{1}^{k-1} a_i + \sum_{j+1}^{n+1} a_i} &= (0, 0, \ldots, 0, 0)
\end{aligned}
$$


Viewed according to Definition 1.6.1, we perform each deformation by moving the "innermost" boundary points, one step at a time, along the lines x = k-2, ..., 2, 1, 0 and y = n-j, n-j-1, ..., 1, 0. Since at each step of the sequence the components $a_i$ satisfy the conditions of 1.5.2, each step represents an admissible deformation. Moreover, because we are always performing the deformation at an "innermost" boundary point, the situation that arose at the boundary point (3,0) in the example can never occur. Instead of starting with S' and moving the "innermost" boundary points until we reach S, the same sequence can be obtained by starting with S and moving the "outermost" boundary points one step at a time. Thus, this particular sequence of deformations can be performed in reverse order and we write this as the canonical sequence of deformations from S to S'. Since we are primarily interested in what happens as we move from S to S' in this sequence, we shall always refer to the canonical sequence of deformations from S to S', as this causes no confusion.
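The order in which the components are changed can be made concrete with a short sketch. It assumes one particular reading of the canonical sequence, taken in the direction from S to S': first the first component, then the last component, then the remaining components of the left block from the outside inward, then the remaining components of the right block from the outside inward. The function name `canonical_sequence` and the parameter `n_left` (the number of components regarded as the left block) are hypothetical.

```python
def canonical_sequence(target, n_left):
    """List the vectors from (0, ..., 0) to `target`, raising one component by one at each step.

    Assumed order of components: first, last, rest of the left block (outer to inner),
    rest of the right block (outer to inner).
    """
    m = len(target)
    if not 1 <= n_left < m:
        raise ValueError("n_left must satisfy 1 <= n_left < len(target)")
    order = [0, m - 1]
    order += list(range(1, n_left))               # left block, moving inward
    order += list(range(m - 2, n_left - 1, -1))   # right block, moving inward
    current = [0] * m
    sequence = [tuple(current)]
    for index in order:
        for _ in range(target[index]):
            current[index] += 1
            sequence.append(tuple(current))
    return sequence

# Illustration with the vector (2, 2, 0, 0, 1, 2), whose left block has two components.
steps = canonical_sequence((2, 2, 0, 0, 1, 2), n_left=2)
```

Under this reading the number of deformations equals the sum of the components of the target vector, so the illustration has seven steps and eight vectors.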


As an example of canonical sequences and our terminology, consider the canonical sequence for deforming S = (0, 0, 0, 0, 0, 0) into S' = (2, 2, 0, 0, 1, 2).

S  = (0, 0, 0, 0, 0, 0)
     (1, 0, 0, 0, 0, 0)
     (2, 0, 0, 0, 0, 0)
     (2, 0, 0, 0, 0, 1)
     (2, 0, 0, 0, 0, 2)
     (2, 1, 0, 0, 0, 2)
     (2, 2, 0, 0, 0, 2)
S' = (2, 2, 0, 0, 1, 2)

1.7 Optimal Sampling Plans.

In each of the previous sections, we have been given a binomial sampling plan, e.g. the fixed size plan $A_n$ or $A_{mn}$, and we investigated certain properties pertaining to these plans. In many cases it is useful to determine which sampling plan is "best" to use under certain conditions. Such optimal sampling plans have been described in a very general setting by Blackwell and Girshick [4], Chapters 9 and 10, and a general procedure for finding such plans developed. In a later chapter, we shall investigate the relationship between these optimal plans and the simple sampling plans described in the previous sections. For this purpose, we shall briefly summarize some of the results given in Blackwell and Girshick [4]. The notation used will be that of [4] in general, but for convenience we make some notational changes, which will be indicated. It must be emphasized that this summary is not complete; proofs and other details may be found in [4], Chapters 9 and 10.

Definition 1.7.1 Let $\mathcal{X} = (X, \Omega, p)$ be the sample space where

(i) $X = \{x : x = (x_1, x_2, \ldots, x_N),\ x_i = 0 \text{ or } 1,\ i = 1,2,\ldots,N\}$

(ii) $\Omega = \{w : 0 < w < 1\}$

(iii) $p_w(x) = w^{\sum x_i} (1 - w)^{N - \sum x_i}$

Definition 1.7.2 Let A be any arbitrary space, the space of terminal actions.

Definition 1.7.3 Let $c_j(x)$, j = 0,1,2,...,N be a set of bounded non-negative functions on $J \times X$, where $J = \{0, 1, 2, \ldots, N\}$, and such that if $x, y \in X$ and $x_i = y_i$, i = 1,2,...,j, then $c_j(x) = c_j(y)$.

Definition 1.7.4 L(w,a) is a bounded, non-negative function defined on $\Omega \times A$.

Definition 1.7.5 Let $\Xi$ be the class of a priori distributions over $\Omega$.

Definition 1.7.6 For a fixed $\xi \in \Xi$ and any bounded function h on $\Omega \times X$, let $E_j(h)$ be the conditional expectation of h given $x_1, x_2, \ldots, x_j$, when w has the distribution $\xi$ and, for a fixed w, x has the distribution $p_w$. (Since the $\xi$ in our discussion is always fixed, we suppress the $\xi$ as given in Blackwell and Girshick [4], p. 239.)

Definition  1.7.7  Let  cpj(x)  3 

Definition 1.7.8 Let $U_j(x) = c_j(x) + T_j(x)$.

Definition 1.7.9 Let $\alpha_N(x) = U_N(x)$ and by induction backward define

$$\alpha_j(x) = \min\,[\,U_j(x),\ E_j(\alpha_{j+1}(x))\,], \quad j < N.$$

The following theorem is proved in [4].

Theorem 1.7.1. Let $\xi$ be fixed. Let $S^* = (S_0^*, S_1^*, \ldots, S_N^*)$ where $S_j^* = \{x : U_i(x) > \alpha_i(x) \text{ for } i < j,\ U_j(x) = \alpha_j(x)\}$. The sequential sampling plan $S^*$ is Bayes against $\xi$, i.e. $\rho(\xi, S^*) = \min_S \rho(\xi, S)$. ($\rho(\xi, S)$ is the risk function.)


The optimal sequential procedure is characterized as follows: at the jth stage we compare the present risk $U_j(x)$ with the average risk $E_j(\alpha_{j+1}(x))$ of going on if we do the best we can with future observations. We stop sampling if $\alpha_j(x) = U_j(x)$, and we take another observation if $\alpha_j(x) = E_j(\alpha_{j+1}(x))$.

The above theorem specifies a general procedure for obtaining optimal sampling plans. In a later chapter, we shall consider the special case of sequential dichotomies for which the cost function is linear and the observations are independent. In this special case, the optimal procedures of Theorem 1.7.1 can be modified in the following way (cf. [4], sec. 9.3). Rather than consider partitions $S^*$ of the sample space X, we consider regions $\Xi_j$ of the space $\Xi$ (cf. 1.7.5). These regions $\Xi_j$ consist of points $\xi$ which are the a posteriori probabilities resulting from $\xi$ and the known probabilities of the first j coordinates of $x \in X$. The optimal procedure is then characterized as follows. Determine the regions $\Xi_0 \subset \Xi_1 \subset \Xi_2 \subset \cdots \subset \Xi_N$. Given a particular $\xi$, if $\xi \in \Xi_0$, take no further observations. If $\xi \notin \Xi_0$, a smaller risk will be incurred if we take another observation and follow the optimal procedure $S^*_{N-1}$. Compute $\xi_1$ from $\xi$ and the probability distribution of $x_1$. If $\xi_1 \in \Xi_1$ stop, otherwise take another observation and compute $\xi_2$. Proceeding in this manner, we are sure to stop after at most N observations, since $\Xi_N = \Xi$.


We shall now interpret these results in terms of the binomial sampling plans considered in the previous sections. It is well known that for independent Bernoulli observations the statistic $(j, m_j)$, $m_j = \sum_{i=1}^{j} x_i$, is a sufficient statistic. Thus, we can base our optimal procedure on the sufficient statistics rather than directly on the observations. Since the sets of the partition $S^*$ now are given in terms of $m_j$, the stopping regions these sets determine also specify the boundary, continuation and inaccessible points in the sampling plans as defined in 1.5. In terms of boundary, continuation and inaccessible points, the optimal sampling plans can be interpreted in the following manner, where $\alpha_j$, $U_j$ and $E_j$ are those of Definitions 1.7.9, 1.7.8 and 1.7.6 respectively.

Lemma 1.7.1. $\alpha_j(m_j) = E_j(\alpha_{j+1}(m_{j+1}))$, j < N, if and only if $(m_j, j - m_j)$ is a continuation point or an inaccessible point.

Lemma 1.7.2. $\alpha_j(m_j) = U_j(m_j)$, j < N, if and only if $(m_j, j - m_j)$ is a boundary point or an inaccessible point.


While the proofs follow directly from the definitions of $\alpha_j$ and $E_j$ and those of continuation and boundary points, we have stated these results as lemmas for later reference purposes. Since an inaccessible point can never be reached, it is immaterial whether $\alpha_j = U_j$ or $\alpha_j = E_j(\alpha_{j+1})$ for this point.
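The backward induction of Definition 1.7.9, expressed in terms of the sufficient statistic $(j, m_j)$, can be sketched as follows. Everything specific in the sketch is an assumption chosen for illustration and is not taken from [4] or from the thesis: a Beta(a, b) prior on w, squared-error loss for estimating w (so that the terminal risk is the posterior variance), and a constant cost per observation. The name `bayes_plan` is hypothetical.

```python
def bayes_plan(N, cost, a=1.0, b=1.0):
    """Backward induction on the states (j, m): j observations taken, m of them equal to 1.

    Assumptions (illustrative only): Beta(a, b) prior, squared-error loss,
    constant cost per observation, truncation at N observations.
    Returns alpha[(j, m)] and stop[(j, m)] (True when alpha_j = U_j).
    """
    def present_risk(j, m):
        # Cost already paid plus posterior variance of Beta(a + m, b + j - m).
        A, B = a + m, b + j - m
        return cost * j + A * B / ((A + B) ** 2 * (A + B + 1))

    alpha = {(N, m): present_risk(N, m) for m in range(N + 1)}
    stop = {(N, m): True for m in range(N + 1)}
    for j in range(N - 1, -1, -1):
        for m in range(j + 1):
            p_one = (a + m) / (a + b + j)   # predictive probability that the next observation is 1
            go_on = p_one * alpha[(j + 1, m + 1)] + (1 - p_one) * alpha[(j + 1, m)]
            u = present_risk(j, m)
            stop[(j, m)] = u <= go_on
            alpha[(j, m)] = min(u, go_on)
    return alpha, stop

alpha_costly, stop_costly = bayes_plan(5, cost=1.0)
alpha_free, stop_free = bayes_plan(5, cost=0.0)
```

In the spirit of Lemmas 1.7.1 and 1.7.2, a state with `stop[(j, m)]` true corresponds to a boundary point (or an inaccessible point) and a state with it false to a continuation point; the sketch does not separate out the inaccessible states.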


CHAPTER II

COMPLETENESS  AND  ESTIMABLE  POLYNOMIALS  IN  SIMPLE  SAMPLING  PLANS 

2.1 Introduction

In  their  fundamental  paper  [9],  Girshick,  Mosteller  and  Savage 
studied  unbiased,  sequential  estimation  in  one  parameter,  binomial  sampling 
plans.  They  showed  how  to  estimate  unbiasedly  polynomials  in  the  para¬ 
meter  p  and  gave  conditions  for  the  completeness  of  these  sequential 
plans.  These  results  were  extended,  among  others,  by  De  Groot  [6]  who 
proved  the  following: 

1.  All  polynomials  (in  p)  of  degree  at  most  n  are  estimable 
unbiasedly  for  every  simple  sampling  plan  of  size  n. 

2.  If  the  boundary  of  a  sampling  plan  of  size  n  contains 
more  than  n  +  1  points,  then  the  plan  is  not  complete. 

3.  If  the  boundary  of  a  sampling  plan  of  size  n  contains 
exactly  n  +  1  points,  the  plan  is  complete. 

We have already investigated completeness and estimation properties of two sampling plans in two parameters in Chapter I. In this Chapter, we shall try to extend De Groot's results to a wider class of two parameter binomial sampling plans and we shall consider certain enumeration problems which arise out of the discussion.

In section 2.2, we prove the completeness of a family of distributions called $C_n$ and list the polynomials estimable unbiasedly. We also examine, in general, two parameter sampling plans, to compare and contrast certain properties of completeness and estimation to those of one parameter sampling plans considered by De Groot [6]. In particular, we shall prove theorems about two parameter sampling plans analogous to the three theorems of De Groot as described above.

While we are unable to list explicitly all the polynomials estimable unbiasedly in general, in Section 2.3 we determine the dimension of the linear space of polynomials estimable unbiasedly for the class of plans we call "proper" and give an upper bound for any simple sampling plan.


The remaining sections deal with enumeration problems concerned with regular sampling plans which arose in previous sections. We prove some combinatorial theorems analogous to those of Narayana [18], which enumerate, when specialized, the number of regular sampling plans. We also consider the relationship of domination of sampling plans to the number of estimable polynomials in the plans.


2.2  Completeness  and  Estimable  Polynomials. 

Before we proceed to define the sampling plans denoted by $C_n$, we shall recall certain definitions and prove several results for finite sampling plans.


Let R be the set of continuation points and boundary points determined by a sampling plan of size n. To each boundary point in R, there exist lattice paths which can be thought of as observations on chance variables $X_1, X_2, \ldots, X_n, \ldots$, where each $X_n$ has the probability distribution defined in 1.2. Thus, to each R there corresponds a sample space $\mathcal{Z} = (Z, \Omega, p)$ where $z \in Z$ represents an observed sequence of values $x_1, x_2, \ldots$, $\Omega = \{(0 < p_1 < 1) \times (0 < p_2 < 1),\ p_1 \ne p_2\}$, and $p_w(z)$ represents the probability of a sequence z, given a value $w \in \Omega$. According to 1.2.2, $p_w(z)$ is given by

2.2.1
$$p_w(z) = p_1^{k+1}\, q_1^{y-k}\, p_2^{x-k-1}\, q_2^{k}$$

or

2.2.2
$$p_w(z) = p_1^{k}\, q_1^{y-k}\, p_2^{x-k}\, q_2^{k}$$

depending on whether the last observation of z was a 1 or a 0. Here k is the number of (1,0) joins and (x,y) is a boundary point of R.
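Formulas 2.2.1 and 2.2.2 can be checked against a direct step-by-step product. The sketch below rests on an assumption about the distribution of section 1.2, which is not reproduced here: an observation following a 0 (or standing first) equals 1 with probability $p_1$, and an observation following a 1 equals 1 with probability $p_2$. This reading is consistent with the exponents in 2.2.1 and 2.2.2 but should be treated as an assumption; the function names and the test values `P1`, `P2` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def path_probability(z, p1, p2):
    """Multiply the step probabilities along the 0/1 sequence z (assumed dependence rule)."""
    q1, q2 = 1 - p1, 1 - p2
    prob, previous = Fraction(1), 0      # the first observation is treated as following a 0
    for obs in z:
        if previous == 0:
            prob *= p1 if obs == 1 else q1
        else:
            prob *= p2 if obs == 1 else q2
        previous = obs
    return prob

def closed_form(z, p1, p2):
    """Evaluate 2.2.1 (last observation 1) or 2.2.2 (last observation 0)."""
    q1, q2 = 1 - p1, 1 - p2
    x = sum(z)                           # number of ones
    y = len(z) - x                       # number of zeros
    k = sum(1 for u, v in zip(z, z[1:]) if (u, v) == (1, 0))   # number of (1,0) joins
    if z[-1] == 1:
        return p1 ** (k + 1) * q1 ** (y - k) * p2 ** (x - k - 1) * q2 ** k
    return p1 ** k * q1 ** (y - k) * p2 ** (x - k) * q2 ** k

P1, P2 = Fraction(1, 3), Fraction(3, 5)
agree = all(path_probability(z, P1, P2) == closed_form(z, P1, P2)
            for length in range(1, 7)
            for z in product((0, 1), repeat=length))
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.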

We define a minimal sufficient partition $\mathcal{S}$ on the sample space Z as follows. Let each $S \in \mathcal{S}$ consist of all those z for which $p_w(z)$ is constant, i.e. if $z_1 \in S$, $z_2 \in S$, then $p_w(z_1) = p_w(z_2)$ for all $w \in \Omega$. On this minimal sufficient partition a sufficient statistic, t, can be defined in such a way that for each $z \in S$, t(z) will have the same value. The sufficient statistic t will take on as many values as there are sets in the minimal sufficient partition. We need only consider estimators which are functions of the sufficient statistic. Thus, we consider an estimator to be a real-valued function f(S) defined on the partition $\mathcal{S}$. Let N(S) be the number of sequences z in the set $S \in \mathcal{S}$.

Definition 2.2.1 Let $P_w(S) = \sum_{z \in S} p_w(z)$. Then $P_w(S) = N(S)\, p_w(z)$, since $p_w(z)$ is constant for each $z \in S$.

Theorem 2.2.1 If $\gamma = (x,y)$ is a continuation point or a boundary point in any sampling plan of size n, then any polynomial of the form

(a) $p_1^{k+1}\, q_1^{y-k}\, p_2^{x-k-1}\, q_2^{k}$, k = 0,1,2,...,min(x-1,y)

(b) $p_1^{k}\, q_1^{y-k}\, p_2^{x-k}\, q_2^{k}$, k = 1,2,...,min(x,y)

can be estimated unbiasedly, provided that a path with k (1,0) joins to (x,y) exists.

Proof: We shall prove that a polynomial of type (a) can be estimated unbiasedly. The proof for a polynomial of type (b) is similar.

First let $\gamma = (x,y)$ be a boundary point and suppose a path with $k_0$ (1,0) joins to (x,y) exists and ends in a one. The probability of such a path is $p_1^{k_0+1}\, q_1^{y-k_0}\, p_2^{x-k_0-1}\, q_2^{k_0}$. This path will represent a sequence $z \in S_0$, for some $S_0 \in \mathcal{S}$. Let

$$f(S) = \begin{cases} 1/N(S) & \text{for } S = S_0 \\ 0 & \text{otherwise.} \end{cases}$$

Then f(S) is the required estimator, for

$$E_w(f(S)) = \frac{1}{N(S_0)}\, P_w(S_0) = p_1^{k_0+1}\, q_1^{y-k_0}\, p_2^{x-k_0-1}\, q_2^{k_0}.$$

Now suppose $\gamma = (x,y)$ is a continuation point and that a path with $k_0$ (1,0) joins exists to the point $\gamma$. Let $\mathcal{S}'$ be the partition obtained from the sampling plan when the origin has been translated to the point $\gamma = (x,y)$. Let us consider all those sequences $z \in Z$ representing paths which pass through the continuation point (x,y) and have $k_0$ (1,0) joins up to the point (x,y). By assumption, there exists at least one such sequence. Each such sequence z can be obtained from a sequence $z' \in Z'$ by adjoining to z' a path going to the point $\gamma = (x,y)$ with $k_0$ (1,0) joins. Let us call the class of sets containing the above sequences $\mathcal{S}^{k_0} \subset \mathcal{S}$. Thus to each set $S' \in \mathcal{S}'$ there corresponds a set $S \in \mathcal{S}^{k_0}$ and conversely. Let N(S') be the number of sequences in S'. For each $S \in \mathcal{S}^{k_0}$ let N'(S) = N(S'), where S' is the set in $\mathcal{S}'$ corresponding to the set $S \in \mathcal{S}^{k_0}$. Thus, $N(S') = N'(S) \le N(S)$. Let

$$f(S) = \begin{cases} N'(S)/N(S) & \text{for } S \in \mathcal{S}^{k_0} \\ 0 & \text{otherwise.} \end{cases}$$

Then f(S) is the required estimator, for

$$
\begin{aligned}
E_w(f(S)) &= \sum_{S \in \mathcal{S}^{k_0}} \frac{N'(S)}{N(S)}\, P_w(S) \\
&= \sum_{S \in \mathcal{S}^{k_0}} \sum_{z \in S} \frac{N'(S)}{N(S)}\, p_w(z) \\
&= \sum_{S \in \mathcal{S}^{k_0}} N'(S)\, p_w(z) \\
&= p_1^{k_0+1}\, q_1^{y-k_0}\, p_2^{x-k_0-1}\, q_2^{k_0} \sum_{S' \in \mathcal{S}'} P_w(S') \\
&= p_1^{k_0+1}\, q_1^{y-k_0}\, p_2^{x-k_0-1}\, q_2^{k_0}.
\end{aligned}
$$


The next theorem is similar to De Groot's Theorem 8.2 [6], which states that if the boundary of a sampling plan of size n contains more than n + 1 boundary points then the plan is not complete. The method of proof is similar to that of De Groot and we state the proof very briefly.

Theorem 2.2.2 If the minimal sufficient statistic takes on more than $\frac{(n+1)(n+2)}{2}$ values for any sampling plan of size n, then the plan is not complete.

Proof: $P_w(S)$ is a polynomial in $p_1$ and $p_2$ of degree at most n. Hence, the expectation of any estimator is such a polynomial. Thus, the expectation operator maps the space of estimators into the space of polynomials in $p_1$ and $p_2$ of degree at most n. Since the dimension of the space of polynomials in $p_1$ and $p_2$ of degree at most n is $\frac{(n+1)(n+2)}{2}$, the dimension of the space of polynomials estimable unbiasedly is at most $\frac{(n+1)(n+2)}{2}$. Since $\{P_w(S),\ S \in \mathcal{S}\}$ spans the space of estimable polynomials, if there are more than $\frac{(n+1)(n+2)}{2}$ polynomials $P_w(S)$, they must be linearly dependent.

In sections 1.3 and 1.4, we have already established that not all polynomials in $p_1$ and $p_2$ of degree at most n are estimable unbiasedly, since in both the distributions $A_n$ and $A_{mn}$ the polynomials $p_2^j$, j = 1,2,...,n were not estimable. This situation extends to all finite sampling plans, i.e. in two parameter sampling plans not all the polynomials of degree n in $p_1$ and $p_2$ are estimable unbiasedly. Thus, the determination of the polynomials estimable unbiasedly in the two parameter case is a great deal more complicated than in the one parameter case. With Theorem 2.2.2 and the fact that not all polynomials in $p_1$ and $p_2$ of degree n are estimable unbiasedly, we have established the results analogous to the first two theorems of De Groot as stated in the introduction.

We shall illustrate the usefulness of Theorem 2.2.1 by considering a class of sampling plans, $C_n$. Let $C_n$ denote the simple sampling plans of size n whose boundary consists of the straight lines

(1) parallel to the axes,

(2) a part of the line x + y = n

(see Figure 2). Such a sampling plan can be represented by $(a,b)_n$, where a, b are defined as follows:

[Figure 2]

Among the n + 1 boundary points belonging to $C_n$, let us disregard those that lie on the line x + y = n. Of the remaining boundary points, let a (b) form the straight line parallel to the x (y) axis, $a + b \le n - 1$, $a \ge 0$, $b \ge 0$, i.e. the boundary points not on the line x + y = n lie on the line y = n - a or x = n - b. The "fixed" sampling plan of size n is represented by $(0,0)_n$. The plan in Figure 2 is (3, …).

To any vector $(a,b)_n$, $a \ge 0$, $b \ge 0$, $a + b \le n - 1$, there corresponds a plan and conversely, i.e. to any 2-composition of the integers n + 1, n, ..., 2 corresponds a plan of $C_n$. The number of r-compositions of n being $\binom{n-1}{r-1}$, the total number of plans in $C_n$ is $\sum_{m=2}^{n+1} \binom{m-1}{1} = \binom{n+1}{2}$.

The coordinates of the boundary points for the sampling plans in $C_n$ are (0,n-a), (1,n-a), ..., (a,n-a), (a+1,n-a-1), ..., (n-b,b), (n-b,b-1), ..., (n-b,1), (n-b,0) and in the vector notation of 1.5, these plans can be denoted as

(a, a-1, a-2, ..., 1, 0, ..., 0, 1, 2, ..., b)
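The description of $C_n$ above translates directly into a short enumeration. The sketch below lists the pairs (a, b), builds the boundary points from the coordinates just given, and builds the displacement vector in the notation of 1.5; the function names are hypothetical and the code only restates what the text says.

```python
from math import comb

def cn_plans(n):
    """All pairs (a, b) with a >= 0, b >= 0 and a + b <= n - 1."""
    return [(a, b) for a in range(n) for b in range(n - a)]

def cn_boundary(n, a, b):
    """Boundary points of the plan (a, b)_n, following the coordinate list in the text."""
    points = {(x, n - a) for x in range(a + 1)}                 # on the line y = n - a
    points |= {(x, n - x) for x in range(a + 1, n - b + 1)}     # on the line x + y = n
    points |= {(n - b, y) for y in range(b + 1)}                # on the line x = n - b
    return points

def cn_vector(n, a, b):
    """Vector (a, a-1, ..., 1, 0, ..., 0, 1, 2, ..., b) with n + 1 components."""
    return tuple(range(a, 0, -1)) + (0,) * (n + 1 - a - b) + tuple(range(1, b + 1))

plan_counts = {n: len(cn_plans(n)) for n in range(1, 8)}
```

For every n the count of plans agrees with $\binom{n+1}{2}$, and each plan has exactly n + 1 boundary points, as required of a simple sampling plan of size n.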


For these plans, the polynomials $P_w(S)$ are given by the following set, denoted by $S_n(a,b)$.

$$
\begin{aligned}
\text{(a)}\quad & \binom{n-a}{k}\binom{r-1}{k-1}\, p_1^{k}\, q_1^{n-a-k}\, p_2^{r-k}\, q_2^{k}, && r = 1,2,\ldots,a;\ \ k = 1,2,\ldots,\min(r,\,n-a) \\
\text{(b)}\quad & \binom{n-a-j}{k}\binom{a+j-1}{k-1}\, p_1^{k}\, q_1^{n-a-j-k}\, p_2^{a+j-k}\, q_2^{k}, && j = 1,2,\ldots,n-a-b-1;\ \ k = 1,2,\ldots,\min(a+j,\,n-a-j) \\
\text{(c)}\quad & \binom{n-a-j}{k}\binom{a+j-1}{k}\, p_1^{k+1}\, q_1^{n-a-j-k}\, p_2^{a+j-k-1}\, q_2^{k}, && j = 1,2,\ldots,n-a-b-1;\ \ k = 0,1,2,\ldots,\min(a+j-1,\,n-a-j) \\
\text{(d)}\quad & \binom{s}{k}\binom{n-b-1}{k}\, p_1^{k+1}\, q_1^{s-k}\, p_2^{n-b-k-1}\, q_2^{k}, && s = 0,1,2,\ldots,b;\ \ k = 0,1,2,\ldots,\min(n-b,\,s) \\
\text{(e)}\quad & q_1^{n-a}
\end{aligned}
$$

The set of polynomials (a) represents paths to the points on the line y = n-a, (b) and (c) those paths to the points on the line x + y = n, (d) those paths to the points on the line x = n-b, and (e) the path to the point (0,n-a). These are polynomials in $p_1$ and $p_2$ such that the maximum degree of $p_1$ is max(n-a, b+1) and the maximum degree of $p_2$ is max(n-b-1, a). A polynomial with the term $p_1^0 p_2^k$ or $p_1^0 q_2^k$, $k \ne 0$, cannot be obtained from the above set, since every path starts with probability $p_1$ or $q_1$.

Let

2.2.3
$$\max(n-a,\ b+1) = u, \qquad \max(n-b-1,\ a) = v.$$

Thus, the set $S_n(a,b)$ consists of polynomials in $p_1$ and $p_2$ of maximum degree u in $p_1$ and v in $p_2$, the degree in $p_1$ plus the degree in $p_2$ being at most n.
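The coefficients in the families (a) to (e) count the paths that share a given polynomial, so they can be compared with a brute-force enumeration of paths. The sketch below does this for small n. It assumes that x counts ones and y counts zeros, that the boundary points are those listed for $(a,b)_n$, and that a polynomial is identified by its exponents of $p_1, q_1, p_2, q_2$; the names are hypothetical. In family (d) the code lets k run only up to n-b-1, since the binomial coefficient vanishes beyond that value.

```python
from collections import Counter
from math import comb

def cn_boundary(n, a, b):
    """Boundary points of the plan (a, b)_n."""
    points = {(x, n - a) for x in range(a + 1)}
    points |= {(x, n - x) for x in range(a + 1, n - b + 1)}
    points |= {(n - b, y) for y in range(b + 1)}
    return points

def count_paths(n, a, b):
    """Enumerate every path to the boundary; tally paths by exponents of (p1, q1, p2, q2)."""
    boundary = cn_boundary(n, a, b)
    tally = Counter()
    stack = [(0, 0, 0, 0)]      # (ones x, zeros y, number of (1,0) joins k, last observation)
    while stack:
        x, y, k, last = stack.pop()
        if (x, y) in boundary:
            runs = k + 1 if last == 1 else k      # exponent of p1
            tally[(runs, y - k, x - runs, k)] += 1
            continue
        stack.append((x + 1, y, k, 1))            # observe a one
        stack.append((x, y + 1, k + last, 0))     # observe a zero; a (1,0) join if last was 1
    return dict(tally)

def formula_counts(n, a, b):
    """Coefficients predicted by the families (a) to (e)."""
    table = {(0, n - a, 0, 0): 1}                                   # (e)
    for r in range(1, a + 1):                                       # (a)
        for k in range(1, min(r, n - a) + 1):
            table[(k, n - a - k, r - k, k)] = comb(n - a, k) * comb(r - 1, k - 1)
    for j in range(1, n - a - b):                                   # (b) and (c)
        x, y = a + j, n - a - j
        for k in range(1, min(x, y) + 1):
            table[(k, y - k, x - k, k)] = comb(y, k) * comb(x - 1, k - 1)
        for k in range(0, min(x - 1, y) + 1):
            table[(k + 1, y - k, x - k - 1, k)] = comb(y, k) * comb(x - 1, k)
    for s in range(0, b + 1):                                       # (d)
        for k in range(0, min(n - b - 1, s) + 1):
            table[(k + 1, s - k, n - b - k - 1, k)] = comb(s, k) * comb(n - b - 1, k)
    return table

coefficients_match = all(count_paths(n, a, b) == formula_counts(n, a, b)
                         for n in range(1, 7)
                         for a in range(n) for b in range(n - a))
```

The enumeration visits each path separately, so it is only practical for small n; it is meant as a check on the formulas rather than as a way of computing them.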


Lemma 2.2.1 The number of different polynomials in the set $S_n(a,b)$ is

$$\binom{n+1}{2} - \binom{a+1}{2} - \binom{b+1}{2} + 1.$$

Proof: The number of different polynomials in the plan $(0,0)_n$ is $\binom{n+1}{2} + 1$, according to Lemma 1.3.2. Starting with the plan $(0,0)_n$, we choose the canonical sequence of deformations that leads to the plan $(a,b)_n$:

(0, 0, ..., 0)
(1, 0, ..., 0)
   ...
(a, 0, ..., 0)
(a, 0, ..., 0, 1)
   ...
(a, 0, ..., 0, b)
(a, 1, 0, ..., 0, b)
   ...
(a, a-1, a-2, ..., 1, 0, ..., 0, b)
(a, a-1, a-2, ..., 1, 0, 0, ..., 0, 1, b)
   ...
(a, a-1, a-2, ..., 1, 0, ..., 0, 1, 2, ..., b).

The first deformation in the canonical sequence results in the loss of one polynomial due to the fact that the one path with k = 0 to the boundary point (1, n-1) is lost. Similarly the second deformation results in the loss of one path (polynomial) with k = 0 to the boundary point (2, n-2). Thus, the first "a" deformations result in the loss of "a" paths. After the first "a" deformations, the next "b" deformations result in the loss of 1 path, corresponding to k = 1 to the boundary point (n-1, 1). The next "a-1" deformations result in the loss of 1 path (polynomial) corresponding to k = 1 to the boundary point (2, n-2), the next "a-2" deformations result in the loss of 2 polynomials and so forth, until we have performed $\sum_{i=1}^{a} i + b$ deformations. Similarly, the next "b-1" deformations result in the loss of 2 polynomials corresponding to k = 1,2; the next "b-2" deformations result in the loss of 3 polynomials and so forth until after $\sum_{i=1}^{a} i + \sum_{j=1}^{b} j$ deformations, the number lost is $\sum_{i=1}^{a} i + \sum_{j=1}^{b} j$. Thus, the number of polynomials in $C_n$ is

$$\binom{n+1}{2} + 1 - \sum_{i=1}^{a} i - \sum_{j=1}^{b} j.$$

The total number of polynomials can also be obtained by counting the number of polynomials in the set $S_n(a,b)$. The set $S_n(a,b)$ consists of polynomials in $p_1$ and $p_2$ of maximum degree u in $p_1$ and of maximum degree v in $p_2$ and such that the degree of $p_1$ plus the degree of $p_2$ is less than or equal to n. Polynomials of the type $p_1^0 p_2^k$ do not appear. Since $a + b \le n - 1$, then $n - a \ge b + 1$ and $n - b - 1 \ge a$. Thus, according to 2.2.3, v = n - b - 1 and u = n - a. Therefore, the total number of polynomials in $S_n(a,b)$ is $(v+1)(u+1) - v - \sum_{i=0}^{n-a-b-1} i$, where the latter sum represents the number of those polynomials in $p_1$ and $p_2$ whose degree is greater than n, and it is readily verified that

$$(v+1)(u+1) - v - \sum_{i=0}^{n-a-b-1} i = (n-b)(n-a) + 1 - \sum_{i=0}^{n-a-b-1} i = \binom{n+1}{2} - \binom{a+1}{2} - \binom{b+1}{2} + 1.$$
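The count in Lemma 2.2.1 can be confirmed for small n by listing the distinct polynomials reached by paths in each plan $(a,b)_n$. The sketch below makes the same assumptions as before (x counts ones, y counts zeros, a polynomial is identified by its exponents of $p_1, q_1, p_2, q_2$) and uses hypothetical function names; it also checks the intermediate expression $(n-b)(n-a) + 1 - \binom{n-a-b}{2}$.

```python
from math import comb

def cn_boundary(n, a, b):
    """Boundary points of the plan (a, b)_n."""
    points = {(x, n - a) for x in range(a + 1)}
    points |= {(x, n - x) for x in range(a + 1, n - b + 1)}
    points |= {(n - b, y) for y in range(b + 1)}
    return points

def distinct_polynomials(boundary):
    """Set of exponent tuples (p1, q1, p2, q2) of the polynomials reached at the boundary."""
    found, seen = set(), set()
    stack = [(0, 0, 0, 0)]      # (ones x, zeros y, number of (1,0) joins k, last observation)
    while stack:
        state = stack.pop()
        if state in seen:
            continue            # different paths leading to the same state give the same polynomial
        seen.add(state)
        x, y, k, last = state
        if (x, y) in boundary:
            runs = k + 1 if last == 1 else k
            found.add((runs, y - k, x - runs, k))
            continue
        stack.append((x + 1, y, k, 1))
        stack.append((x, y + 1, k + last, 0))
    return found

def lemma_count(n, a, b):
    """Right-hand side of Lemma 2.2.1."""
    return comb(n + 1, 2) - comb(a + 1, 2) - comb(b + 1, 2) + 1

lemma_holds = all(len(distinct_polynomials(cn_boundary(n, a, b))) == lemma_count(n, a, b)
                  for n in range(1, 7) for a in range(n) for b in range(n - a))
identity_holds = all((n - b) * (n - a) + 1 - comb(n - a - b, 2) == lemma_count(n, a, b)
                     for n in range(1, 12) for a in range(n) for b in range(n - a))
```

For example, the plan $(1,1)_4$ yields nine distinct polynomials, which agrees with $\binom{5}{2} - 1 - 1 + 1 = 9$.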


Theorem 2.2.3 The sampling plans $C_n$ are complete.

Proof: By Theorem 2.2.1, we can estimate unbiasedly the following polynomials:

$$
\begin{array}{lllll}
1, & p_1, & q_1 p_1, & \ldots, & q_1^{u-1} p_1, \\
 & p_1 p_2, & q_1 p_1 p_2, & \ldots, & q_1^{u-1} p_1 p_2, \\
 & p_1 p_2^2, & q_1 p_1 p_2^2, & \ldots, & q_1^{u-1} p_1 p_2^2, \\
 & \ \vdots & & & \\
 & p_1 p_2^v, & q_1 p_1 p_2^v, & \ldots, & q_1^{u-1} p_1 p_2^v,
\end{array}
$$

where we omit all polynomials such that the degree of $p_1$ plus the degree of $p_2$ is greater than n. The above set of polynomials is linearly independent, contains

$$(v+1)u + 1 - \frac{(n-a-b-1)(n-a-b)}{2} = \binom{n+1}{2} - \binom{a+1}{2} - \binom{b+1}{2} + 1$$

polynomials and spans the space of estimable polynomials. Since the $\binom{n+1}{2} - \binom{a+1}{2} - \binom{b+1}{2} + 1$ polynomials $P_w(S) \in S_n(a,b)$ also span the space of estimable polynomials, they must be linearly independent. Thus, the theorem is proved.

We finally remark that the class of sampling plans $C_n$ contains the sampling plans $A_n$ and $A_{mn}$ discussed in 1.3 and 1.4, since the plans $A_n$ represent the case where a + b = 0 and the plans $A_{mn}$, the case a + b = n - 1. Thus, we have obtained a generalization of the results in these sections. As usual, by putting $p_1 = p_2 = p$ in Theorem 2.2.3, we obtain the result that all polynomials in p of degree at most n are estimable unbiasedly, and thus once again obtain a generalization of De Groot's Theorem 8.2 [6] to the two parameter case.


Theorem 2.2.1 enables us to give a method for constructing the basis for the linear space of estimable polynomials for a given sampling plan. Consider the set of boundary and continuation points given by the plan. To each of these points $\gamma = (x,y)$ there exists at least one path. Corresponding to this path is a polynomial of degree $y + 1$ ($y$) in $p_1$ and of degree $x - 1$ ($x$) in $p_2$ if the path ends in a horizontal (vertical) step. By Theorem 2.2.1 each such polynomial is estimable. Retain all those polynomials that are of different degree in $p_1$ or $p_2$. If the number of polynomials retained is equal to the total number of different polynomials to each boundary point, then we have a basis for the space of estimable polynomials.

Let $\gamma_1, \gamma_2, \ldots, \gamma_r$, $r \ge n + 1$, be the boundary points of a sampling plan of size $n$. Let $k_1, k_2, \ldots, k_r$ represent the number of different polynomials (paths) to the boundary points $\gamma_1, \gamma_2, \ldots, \gamma_r$. Let $Z$ be the space of polynomials spanned by the $k_1 + k_2 + \cdots + k_r$ polynomials. Then $Z$ is the linear space of estimable polynomials. If the dimension of $Z$ is less than $k_1 + k_2 + \cdots + k_r$, then clearly the sampling plan is not complete. On the other hand, if the dimension of $Z$ equals $k_1 + k_2 + \cdots + k_r$, then the plan is complete.

We remark that in the case of one parameter sampling plans of size $n$ the dimension of $Z$ is $n + 1$, since every polynomial of at most degree $n$ is estimable unbiasedly. In the two parameter case, not every polynomial in $p_1$ and $p_2$ of degree $n$ is estimable unbiasedly and thus the dimension of $Z$ is not easily determined even for simple sampling plans. We present a partial solution to this problem in the next section.


2.3 Upper Bounds for the Number of Estimable Polynomials in the Basis of the Linear Space of Estimable Polynomials in Simple Sampling Plans.


Since, in general, the number of different k values to any particular boundary point determines the total number of values the sufficient statistic takes on, it would be useful if one could determine this number without the tedious process of counting. By using the idea of the canonical sequence of deformations defined in 1.5, we can obtain this number for a large class of simple sampling plans and obtain an upper bound for this number for all simple sampling plans. Since to each path with k (1,0) joins and ending in a vertical or horizontal step there corresponds a polynomial $P_w(s)$ and conversely, we refer to either paths or polynomials since this causes no confusion. By lemma 1.5.2, we know that the number of different polynomials in the "fixed" size sampling plan of size $n$ is $\frac{n(n+1)}{2} + 1$. By using the canonical sequence of deformations, we can determine how many "k" values are gained or lost at each stage of the sequence, i.e. we can count the number of "k" values in each sampling plan in the canonical sequence. In order to facilitate this counting, we shall classify the sampling plans defined by 1.5.2 into various classes.

As a first step, we propose to introduce a particularly simple class of plans called basic plans and then consider two further classes of plans which we shall call "proper" and "regular". In what follows, we shall use the vector notation of definition 1.5.2. We recall that the boundary points denoted by $a_1, a_2, \ldots, a_{k-1}$ in this vector notation refer to displacement parallel to the y axis (i.e. on the left), while the boundary points denoted by $a_{j+1}, \ldots, a_{n+1}$ refer to displacement parallel to the x-axis (i.e. on the right).


Definition 2.3.1 A sampling plan described by the vector $(a_1, a_2, \ldots, a_{n+1})$ is said to be a basic sampling plan if all the $a_i = 0$ except possibly $a_1$ or $a_{n+1}$.

A basic sampling plan has all its boundary points on the line $x + y = n$ except possibly the first and the last. These plans are an intermediate step between any simple sampling plan of size $n$ and the "fixed" size sampling plan.


Definition 2.3.2 A sampling plan described by the vector $(a_1, a_2, \ldots, a_{n+1})$ is said to be regular if the components $a_i$ satisfy the following conditions.

(a') condition (a) of 1.5.2.

(b') Let $k$ be the smallest integer $i$ such that $a_i = a_{i+1} = 0$ and let $j$, $j \ge k + 1$, be the largest integer $i$ such that $a_i = 0$. Then $a_1 > a_2 > \cdots > a_{k-1} > 0$ and $0 < a_{j+1} < a_{j+2} < \cdots < a_{n+1}$.

(c') condition (c) of 1.5.2.
A regular sampling plan is one in which the non-zero components
are  strictly  increasing  or  decreasing.  To  put  it  another  way,  except  for 
the  zero  components  no  two  adjacent  components  are  equal.  These  sampling 
plans  have  the  property  (in  the  case  of  the  two  parameter  distributions) 
that,  except  for  boundary  points  corresponding  to  some  zero  components, 
only  paths  that  end  in  a  vertical  or  a  horizontal  step,  but  not  both,  exist 
to  each  boundary  point  corresponding  to  a  non-zero  component. 
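Condition (b') in the definition of a regular plan lends itself to a mechanical check. The following is a minimal sketch and not part of the original text: the function name `satisfies_b_prime` is hypothetical, conditions (a') and (c') of 1.5.2 are not checked, and a vector without two adjacent zero components is simply rejected (an assumption made here for convenience).

```python
def satisfies_b_prime(a):
    """Check condition (b') for a vector a = (a_1, ..., a_{n+1}).

    Indices in the comments are 1-based, as in the text.
    """
    m = len(a)
    # k: smallest index i with a_i = a_{i+1} = 0
    k = next((i for i in range(1, m) if a[i - 1] == 0 and a[i] == 0), None)
    if k is None:
        return False  # assumption: (b') is not applicable without adjacent zeros
    # j: largest index i with a_i = 0
    j = max(i for i in range(1, m + 1) if a[i - 1] == 0)
    left, right = a[: k - 1], a[j:]
    # a_1 > a_2 > ... > a_{k-1} > 0
    decreasing = all(v > 0 for v in left) and all(p > q for p, q in zip(left, left[1:]))
    # 0 < a_{j+1} < ... < a_{n+1}
    increasing = all(v > 0 for v in right) and all(p < q for p, q in zip(right, right[1:]))
    return decreasing and increasing
```

For instance, the vector (3,2,1,0,0,0,1,2,3) passes this check, while (3,2,1,0,0,0,1,3,3), whose last two components are equal, does not.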


Definition 2.3.3 A sampling plan is irregular if it is not regular.

Definition 2.3.4 A sampling plan is proper if

(a) $a_j \le \left[\frac{n+1}{2}\right] - j + 1$, $j = 1, 2, \ldots, k-1$, and

(b) $a_i \le \left[\frac{n+1}{2}\right] - (n + 1 - i)$, $i = k+2, \ldots, n+1$.

Definition 2.3.5 A sampling plan that is not proper is improper.


A proper sampling plan is one in which all the boundary points not on the line $x + y = n$ with $x \le \left[\frac{n+1}{2}\right]$ lie on or above the line $y = \left[\frac{n+1}{2}\right]$. Similarly the boundary points with $x > \left[\frac{n+1}{2}\right]$ lie on or to the right of the line $x = \left[\frac{n+1}{2}\right]$. In a proper sampling plan, there is no "interference" with the number of paths to any boundary point with $x < \left[\frac{n+1}{2}\right]$ by boundary points below it and similarly to any boundary point with $x > \left[\frac{n+1}{2}\right]$ by points to the left of it.


With  this  classification  of  sampling  plans,  we  can  now  proceed 
to  determine  the  number  of  polynomials  in  the  basis  for  the  linear  space 
of  estimable  polynomials  in  each  sampling  plan.  First,  we  determine  this 
number  for  basic  sampling  plans  of  definition  2.3.1.  Then,  using  the  concept 


* 

. 

' 

* 

J 

■ 
of canonical deformation, we determine this number for all regular, irregular
and  proper  plans.  This  method  also  enables  us  to  determine  an  upper  bound 
for  the  remaining  class  of  improper  plans. 


Lemma 2.3.1 The number of different polynomials in a basic plan of size $n$ is given by

(1) $E_n = \frac{n(n+1)}{2} + 1 - a_1$ if $a_{n+1} = 0$,

(2) $E_n = \frac{n(n+1)}{2} - a_1$ if $a_{n+1} = 1, 2, \ldots, n-2$ and $a_1 \ne n-1$,

(3) $E_n = \frac{n(n+1)}{2} + 1 - a_1 - a_{n+1}$ if $a_{n+1} = n-1$ or $a_1 = n-1$.

Proof: Let us consider the canonical sequence of deformations from $(0, \ldots, 0)$ to the plan $(a_1, 0, \ldots, 0, a_{n+1})$. The number of different polynomials in the plan $(0, \ldots, 0)$ is $\frac{n(n+1)}{2} + 1$ by Lemma 1.3.1. Consider case (1). The first deformation moves the boundary point $(0,n)$ to the boundary point $(0,n-a_1)$. Every path to the boundary point $(n-r,r)$, $r > 0$, having k = 0 (1,0) joins must pass through the point $(0,r)$ and ends with horizontal steps from $(0,r)$ to $(n-r,r)$. Thus, it is clear that $a_1$ polynomials have been lost and $E_n = \frac{n(n+1)}{2} + 1 - a_1$. Now consider cases (2) and (3). The first $a_1$ deformations result in the loss of $a_1$ polynomials as in case (1). If $a_1 \ne n-1$ and $a_{n+1} = 1, 2, \ldots, n-2$, one more polynomial is lost: the one corresponding to the path with k = 1 (1,0) joins and ending in a vertical step to the point $(n-1,1)$. However, if $a_1 = n-1$ a further $a_{n+1} - 1$ polynomials are lost corresponding to the paths with k = 1 (1,0) joins and ending in a vertical step to the points $(n-2,2), (n-3,3), \ldots, (n-a_{n+1}, a_{n+1})$. A similar argument applies if $a_{n+1} = n-1$. Hence the lemma is proved.



Before we proceed to the next theorem, we shall give an informal explanation of the ideas behind the proof. We recall that, in a regular sampling plan, to all the boundary points corresponding to the components $a_1, a_2, \ldots, a_{k-1}$ there exist only paths that end in a vertical step. Similarly, to all the boundary points corresponding to the components $a_{j+1}, \ldots, a_{n+1}$ only paths that end in a horizontal step exist. From 1.2, we know that the possible k values for paths ending in a vertical step range from 1 to min(x,y) and for those ending with a horizontal step range from 0 to min(x-1,y). Therefore, if a plan is proper, we know that min(x,y) = x and min(x-1,y) = x-1 for all boundary points to the left of the line $x = \left[\frac{n+1}{2}\right]$. Similarly, min(x-1,y) = y = min(x,y) for all boundary points to the right of the line $x = \left[\frac{n+1}{2}\right]$. Moreover, in a proper plan, k takes on all the values from 1 to x for all boundary points $\gamma = (x,y)$ to the left of the line $x = \left[\frac{n+1}{2}\right]$ (except for the point (0,y)) and all the values from 0 to y for all boundary points to the right of the line $x = \left[\frac{n+1}{2}\right]$ (except for the point (x,0)). If we consider a regular, proper plan, it is easily seen that in the canonical sequence of deformations leading to this plan, all the plans are regular and proper. Thus, we need only concern ourselves with the counting of one type of path at each stage of the sequence (ending either in a horizontal or vertical step) and we know how many k (1,0) joins exist for these paths. The following example is given to help clarify these ideas.

Consider the canonical sequence of deformations leading to the plan

$S' = (4, 2, 1, 0, 0, 0, 0, 1, 2)$.

$S_0 = (0, 0, 0, 0, 0, 0, 0, 0, 0)$
$S_1 = (1, 0, 0, 0, 0, 0, 0, 0, 0)$
$S_2 = (2, 0, 0, 0, 0, 0, 0, 0, 0)$
$S_3 = (3, 0, 0, 0, 0, 0, 0, 0, 0)$
$S_4 = (4, 0, 0, 0, 0, 0, 0, 0, 0)$
$S_5 = (4, 0, 0, 0, 0, 0, 0, 0, 1)$
$S_6 = (4, 0, 0, 0, 0, 0, 0, 0, 2)$
$S_7 = (4, 1, 0, 0, 0, 0, 0, 0, 2)$
$S_8 = (4, 2, 0, 0, 0, 0, 0, 0, 2)$
$S_9 = (4, 2, 1, 0, 0, 0, 0, 0, 2)$
$S_{10} = (4, 2, 1, 0, 0, 0, 0, 1, 2)$.


According to Lemma 2.3.1, after $S_6$ the number of polynomials is $\frac{8 \cdot 9}{2} - 4 = 32$. After $S_7$, there are 31 since all the paths ending with a horizontal step to the boundary point (2,6) are lost. Since we have already taken into account the paths with k = 0 (1,0) joins, we need only count as lost those paths with k = 1 (1,0) joins to the point (2,6). After $S_8$ (i.e. after moving one step), there are still 31, since any further deformations on the line x = 1 do not affect any other k values to any other boundary point. In the proof of Theorem 2.3.1 we will utilize the fact that we need count losses of polynomials only at certain stages of
the canonical sequence of deformations. Similarly, after $S_9$ two more polynomials are lost and two more are lost after $S_{10}$. Thus, the total number of polynomials in $S'$ is 27 and this number can easily be verified by actual counting. With this discussion, we can proceed to the theorem.
Theorem 2.3.1 Let $(a_1, a_2, \ldots, a_{k-1}, 0, \ldots, 0, a_{j+1}, \ldots, a_{n+1}) = S'$ represent a regular, proper sampling plan of size $n$ ($j \ge k+1$ and $a_{j+1} \ne 0$). Then, the number of different polynomials in the plan $S'$ is given by

\[ A_n = E_n - \left( \sum_{i=1}^{k-2} i + \sum_{l=2}^{n-j+1} l \right), \]

where $E_n$ is the number of Lemma 2.3.1.

Proof: Let us consider the sequence of canonical deformations from the plan $(0, \ldots, 0)$ to $S'$. After the first $a_1 + a_{n+1}$ deformations we arrive at a basic plan, according to definition 2.3.1. By Lemma 2.3.1, the number of polynomials is given by $E_n$. The next $a_2$ deformations result in the loss of 1 polynomial, corresponding to paths with k = 1 (1,0) joins and ending in a horizontal step to the point $(3,n-3)$. Similarly, the next $a_3$ deformations result in the loss of 2 more polynomials corresponding to paths with k = 1 and k = 2 (1,0) joins and ending in a horizontal step to the point $(4,n-4)$. We proceed in this manner until $\sum_{i=2}^{k-1} a_i$ deformations have been performed, losing $3, 4, \ldots, k-2$ polynomials at each stage. Since these plans are regular and proper, no more polynomials are lost. Similarly, after performing the next $\sum_{i=j+1}^{n} a_i$ deformations we lose $2, 3, \ldots, n-j+1$ at each stage corresponding to the k values for all paths ending in a vertical step to the boundary points $(n-3,3), \ldots, (n-(n-j+2), n-j+2)$. Since the plans are always regular and proper, no more polynomials are lost, yielding

\[ A_n = E_n - \left( \sum_{i=1}^{k-2} i + \sum_{l=2}^{n-j+1} l \right). \]
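The count in Theorem 2.3.1 is easy to mechanize. The sketch below is illustrative only and rests on stated assumptions: the function names are hypothetical; the three cases for a basic plan are taken to be $E_n = n(n+1)/2 + 1 - a_1$ (if $a_{n+1} = 0$), $E_n = n(n+1)/2 - a_1$ (if $1 \le a_{n+1} \le n-2$ and $a_1 \ne n-1$) and $E_n = n(n+1)/2 + 1 - a_1 - a_{n+1}$ (if $a_1 = n-1$ or $a_{n+1} = n-1$); and $k$ and $j$ are read off as the positions of the first and the last zero component. No check is made that the plan is in fact regular and proper.

```python
def basic_plan_count(n, a_first, a_last):
    """E_n for the basic plan (a_first, 0, ..., 0, a_last) of size n."""
    fixed = n * (n + 1) // 2 + 1  # count for the fixed size plan
    if a_last == 0:
        return fixed - a_first  # case (1)
    if a_first == n - 1 or a_last == n - 1:
        return fixed - a_first - a_last  # case (3)
    return fixed - a_first - 1  # case (2)


def regular_proper_count(a):
    """A_n = E_n - (sum_{i=1}^{k-2} i + sum_{l=2}^{n-j+1} l) for a = (a_1, ..., a_{n+1})."""
    n = len(a) - 1
    zero_positions = [i for i, v in enumerate(a, start=1) if v == 0]
    k, j = zero_positions[0], zero_positions[-1]  # assumed reading of k and j
    e_n = basic_plan_count(n, a[0], a[-1])
    lost_left = sum(range(1, k - 1))  # 1 + 2 + ... + (k-2)
    lost_right = sum(range(2, n - j + 2))  # 2 + 3 + ... + (n-j+1)
    return e_n - (lost_left + lost_right)
```

Applied to the plan (4, 2, 1, 0, 0, 0, 0, 1, 2) of the example preceding the theorem, `regular_proper_count` gives 27, the total stated there.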


Corollary. If $S$ is a regular, improper sampling plan, the number $A_n$ is an upper bound for the number of polynomials in $S$.

Proof: If $S$ is regular and improper, at least the number of polynomials as stated in the theorem is lost. In fact, more polynomials can be lost at some stages in the canonical sequence due to the presence of boundary points below or to the right of the boundary points in question. Clearly, $A_n$ is an upper bound.

We now turn to the consideration of irregular sampling plans. We have already seen that in improper plans more polynomials can be lost due to the "interference" of other boundary points. In irregular plans, at certain stages of the canonical sequence, polynomials can be regained because of the "irregular" nature of these plans. Before we prove the next theorem, we shall give an informal discussion of some of the ideas motivating the proof and give several examples to clarify the situation.

In the canonical sequence of deformations to a regular plan, all the plans at each stage of the sequence are regular. In regular plans only paths that end in a vertical step exist to boundary points corresponding to the components $a_1, a_2, \ldots, a_{k-1}$ and only paths that end in a horizontal step exist to boundary points corresponding to the components $a_{j+1}, a_{j+2}, \ldots, a_{n+1}$. Moreover, if the plan is proper, then all the plans in the canonical sequence are also proper, and as we have seen in Theorem 2.3.1, only paths ending in a vertical (horizontal) step are lost if the deformations are performed parallel to the x(y) axis. However, if a plan is irregular then both types of path can exist to any boundary point and the simple situation of Theorem 2.3.1 no longer occurs. In fact, some of the paths that were counted as lost may reappear as soon as an irregular plan occurs in the canonical sequence of deformations. We can decide how many paths reappear by recalling that in performing the last $\sum_{i=j+1}^{n} a_i$ deformations only paths whose last step is vertical can reappear, while in performing the first $\sum_{i=2}^{k-1} a_i$ deformations only paths whose last step is horizontal can reappear. If these plans are also proper, all the k values (k = 1, 2, ..., min(x,y)) corresponding to paths ending in a vertical step can reappear but not all paths ending with a horizontal step may reappear. In particular, paths with k = 0 (1,0) joins and ending in a horizontal step will not reappear if there is a boundary point or inaccessible point to the left of the boundary point in question. Thus, if $a_i$ represents the coordinate of the vector at which point an "irregularity" occurs, we can consider the two cases in which

(1) no boundary or inaccessible point lies to the left of the boundary point on the line $y = (n-i+1) - a_i$ and

(2) at least one boundary or inaccessible point lies to the left of the boundary point on the line $y = (n-i+1) - a_i$.

Case (1) will occur if $a_r + r \le a_i + i$ for all $r = 1, 2, \ldots, i-1$ and case (2) will occur if $a_r + r > a_i + i$ for at least one $r$, $r = 1, 2, \ldots, i-1$.

Theorem 2.3.2 Let $(a_1, a_2, \ldots, a_{k-1}, 0, \ldots, 0, a_{j+1}, \ldots, a_{n+1}) = S$ represent an irregular, proper plan of size $n$. Then the number of different polynomials in $S$ is given by

\[ B_n = E_n - \left( \sum_{i=1}^{k-2} i + \sum_{l=2}^{n-j+1} l \right) + \Sigma' , \]

where $\Sigma'$ is a positive integer determined as follows: $\Sigma'$ is obtained by

(i) adding together the indices $i$, $i = 1, 2, \ldots, k-2$, such that $a_i = a_{i+1}$ and the indices $l$, $l = 1, 2, \ldots, n-j$, such that $a_{n-l+1} = a_{n-l+2}$, and then

(ii) subtracting one for each $i$ such that $a_i = a_{i+1}$, $i = 1, 2, \ldots, k-2$, if there is at least one $r$, $r = 1, 2, \ldots, i-1$, such that $a_r + r > a_i + i$. Note that in case (1), step (ii) is not necessary.

(iii) subtract 1 for each $i$ such that $a_i = a_{i+1}$, if $a_{i+1} + i > a_1$ and, for each $r < i$, $a_r + r - 1 \le a_i + i$.

We now give 2 examples to illustrate the computation of $\Sigma'$. Consider the irregular, proper plan of size 11 (3,3,3,2,1,0,0,1,1,2,4,4). Clearly there are no boundary points or inaccessible points to the left of the boundary point corresponding to $a_2$ and $\Sigma' = 1 + 2 + 1 + 4 - 1 - 1 = 6$. If we take the irregular, proper plan of size 11 (5,3,3,1,1,0,0,1,1,3,4,4), it is easily seen that we are in case (2) and $\Sigma' = 2 + 4 + 1 + 4 - 1 - 1 = 9$.
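The bookkeeping in these two examples can be sketched in code. What follows is an assumption-laden illustration rather than the original procedure: the function name `sigma_prime` is hypothetical, and steps (i) to (iii) are read as (i) add the indices $i \le k-2$ with $a_i = a_{i+1}$ and the indices $l \le n-j$ with $a_{n-l+1} = a_{n-l+2}$; (ii) subtract one for each such $i$ for which some $r < i$ has $a_r + r > a_i + i$; (iii) subtract one for each such $i$ with $a_{i+1} + i > a_1$ and $a_r + r - 1 \le a_i + i$ for every $r < i$. Here $k$ and $j$ are taken as the positions of the first and last zero component.

```python
def sigma_prime(a):
    """Correction term for an irregular, proper plan a = (a_1, ..., a_{n+1})."""
    n = len(a) - 1
    zero_positions = [i for i, v in enumerate(a, start=1) if v == 0]
    k, j = zero_positions[0], zero_positions[-1]

    def at(i):
        return a[i - 1]  # 1-based access, as in the text

    # Step (i): ties among the left-hand and the right-hand components.
    left_ties = [i for i in range(1, k - 1) if at(i) == at(i + 1)]
    right_ties = [l for l in range(1, n - j + 1) if at(n - l + 1) == at(n - l + 2)]
    total = sum(left_ties) + sum(right_ties)

    for i in left_ties:
        # Step (ii): some earlier point interferes (case (2)).
        if any(at(r) + r > at(i) + i for r in range(1, i)):
            total -= 1
        # Step (iii): the k = 0 path is lost and was not counted in E_n.
        if at(i + 1) + i > at(1) and all(at(r) + r - 1 <= at(i) + i for r in range(1, i)):
            total -= 1
    return total
```

Under this reading the two plans of size 11 above give 6 and 9 respectively.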

Proof of Theorem 2.3.2: The fact that we can lose $\left( \sum_{i=1}^{k-2} i + \sum_{l=2}^{n-j+1} l \right)$ polynomials can be established in the same way as in Theorem 2.3.1. From the discussion preceding the theorem, it is seen that if, at any stage of the canonical sequence, an irregular plan is reached, then all the polynomials corresponding to paths ending in a horizontal or vertical step are added in case (1), and in case (2) all of these paths are added except for the paths corresponding to k = 0 (1,0) joins. Case (iii) takes care of paths with k = 0 (1,0) joins that are lost and not counted by $E_n$. Thus, the number of polynomials in $S$ is $B_n$.

Corollary. The number $B_n$ is an upper bound for the number of polynomials in an irregular, improper sampling plan of size $n$.

Proof: Because the plan is improper and irregular, at least $\sum_{i=1}^{k-2} i + \sum_{l=2}^{n-j+1} l$ polynomials are lost and at most $\Sigma'$ polynomials are added.
2.4 Domination of Sampling Plans and the Number of Estimable Polynomials in the Plan.

In 1.5.3 we defined a sampling plan of size $n$ as the vector $A_n = (a_1, a_2, \ldots, a_{n-1})$ where the $a_i$ satisfied certain conditions. It can be shown that a regular sampling plan can also be expressed as a vector $A_n = (a_1, a_2, \ldots, a_{n-1})$, where the $a_i$ are non-negative integers satisfying the following conditions (2.4.1):

(1') Let $j$ be the first integer $i$ such that $a_i > 0$, $i = 1, 2, \ldots, n-1$. Then $0 < a_j < a_{j+1} < \cdots < a_{n-1}$.

(2') $a_i \le 2i$.

It is clear that definition 1.5.4 of domination of vectors applies to the vectors satisfying conditions 2.4.1 above. We seek to establish some relationship between dominated sampling plans and the number of estimable polynomials in such plans. The following theorem establishes such a relationship. Since a rigorous proof of this theorem would require a lengthy rewriting of some previous definitions, we shall give an informal proof indicating how the definition of canonical sequence of deformations can be modified to yield the required result. Let $n(A)$ represent the number of different estimable polynomials in the sampling plan represented by the vector $A$.

Theorem 2.4.1 If $A_n \ d \ B_n$, and $A_n$ and $B_n$ are both regular, then $n(A_n) \le n(B_n)$.

Proof: In the definition of the canonical sequence of deformations, we made use of the fact that every sampling plan of size $n$ dominates the "fixed" sampling plan. We can, in the same way, define a canonical sequence of deformations from any plan $B_n$ to any plan $A_n$, provided $A_n \ d \ B_n$. Since both $A_n$ and $B_n$ are regular, the canonical sequence of deformations from $B_n$ to $A_n$ yields a regular sampling plan at each stage. Hence, at least a certain number of paths may be lost at each stage and $n(A_n) \le n(B_n)$.

We now give several examples to illustrate the theorems in this section and section 2.3. Consider A = (3,2,1,0,0,0,1,3,3) according to definition 1.5.2 or, equivalently, A = (0,1,2,3,5,6,6) according to definition 1.5.3 and similarly, B = (3,2,1,0,0,0,1,2,3) = (0,1,2,3,4,5,6). B is a regular proper plan and A is an irregular, proper plan. We determine n(A) and n(B) from Theorems 2.3.2 and 2.3.1.

\[ n(A) = \left( \binom{9}{2} - 3 \right) - [(1 + 2) + (2 + 3)] + 1 = 26 \]

\[ n(B) = \left( \binom{9}{2} - 3 \right) - [(1 + 2) + (2 + 3)] = 25. \]

It is readily seen that n(A) - n(B) = 1, since in A one path ending with a vertical step has been added due to the "irregularity" at the boundary point $a_8$, [(4,1)]. Thus A d B, but n(A) > n(B). Similarly for the irregular proper plans C = (3,2,1,0,0,0,1,3,3,4) and D = (3,2,1,0,0,0,1,2,3,3), we have

\[ n(C) = \left( \binom{10}{2} - 3 \right) - [(1 + 2) + (2 + 3 + 4)] + 2 = 32, \]

\[ n(D) = \left( \binom{10}{2} - 3 \right) - [(1 + 2) + (2 + 3 + 4)] + 1 = 31. \]

C d D but n(C) > n(D). Thus Theorem 2.4.1, which holds only for regular plans, cannot be extended to irregular plans in general.
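The arithmetic behind these four counts can be reproduced directly. The snippet below is merely a convenience check with hypothetical variable names; each line mirrors one displayed computation (the count for the basic plan, minus the losses, plus the correction for the irregularity).

```python
from math import comb

# Plans of size 8: basic-plan count C(9,2) - 3, losses (1+2) + (2+3).
n_A = (comb(9, 2) - 3) - ((1 + 2) + (2 + 3)) + 1  # irregular plan A
n_B = (comb(9, 2) - 3) - ((1 + 2) + (2 + 3))      # regular plan B

# Plans of size 9: basic-plan count C(10,2) - 3, losses (1+2) + (2+3+4).
n_C = (comb(10, 2) - 3) - ((1 + 2) + (2 + 3 + 4)) + 2  # irregular plan C
n_D = (comb(10, 2) - 3) - ((1 + 2) + (2 + 3 + 4)) + 1  # irregular plan D

counts = {"A": n_A, "B": n_B, "C": n_C, "D": n_D}
```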


2.5 Combinatorial Theorems.

In this section, we shall present several theorems which have been mentioned but not fully discussed in the literature [18]. In [18], Narayana proved a combinatorial theorem, analogous to the multinomial theorem, which was used to prove various results concerning vectors with non-negative integral components. This approach yielded, as a special case, the number of simple sampling plans of size $n$. In this section, we present a theorem analogous to Narayana [18], from which we shall obtain, as a special case, the number of regular sampling plans. We shall also mention very briefly some interesting results that do not pertain directly to regular sampling plans. A complete discussion of the interesting combinatorial aspects of these and other theorems is available in [19].


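Theorem 2.5.1 Let $n, x_1, x_2, \ldots, x_k$ be non-negative integers. Define recursively

\[ (0; 0, \ldots, 0) = 1, \]

\[ (n; x_1, x_2, \ldots, x_k) = 0 \quad \text{if } \sum_{i=1}^{k} x_i > n, \]

\[ (n; x_1, x_2, \ldots, x_k) = \sum_{\substack{\delta_i = 0 \text{ or } 1 \\ i = 1, 2, \ldots, k}} (n-1; x_1 - \delta_1, x_2 - \delta_2, \ldots, x_k - \delta_k) \quad \text{otherwise.} \]

The sum on the right is taken over $2^k$ terms. Then

\[ (n; x_1, x_2, \ldots, x_k) = \prod_{i=1}^{k} \binom{n+1}{x_i} \left[ 1 - \frac{\sum_{i=1}^{k} x_i}{n+1} \right] \quad \text{if } \sum_{i=1}^{k} x_i \le n, \]

and $(n; x_1, x_2, \ldots, x_k) = 0$ otherwise.

We note that $(n; x_1, x_2, \ldots, x_k)$ can be expressed in determinant form as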
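\[ (n; x_1, x_2, \ldots, x_k) = \frac{\prod_{i=1}^{k} \binom{n+1}{x_i}}{(n+1)^k}
\begin{vmatrix}
n - x_1 + 1 & -x_2 & \cdots & -x_k \\
-x_1 & n - x_2 + 1 & \cdots & -x_k \\
\vdots & & \ddots & \vdots \\
-x_1 & -x_2 & \cdots & n - x_k + 1
\end{vmatrix} . \]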


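Proof: That $(n; x_1, x_2, \ldots, x_k) = 0$ if $\sum_{i=1}^{k} x_i > n$ is true by definition. The theorem is true for n = 0, 1 by direct verification. We assume the theorem is true for all positive integers $n \le m - 1$ ($m - 1 \ge 1$). Then, with $\sum_{i=1}^{k} x_i \le m$, we have

\[ (m; x_1, x_2, \ldots, x_k) = \sum_{\substack{\delta_i = 0 \text{ or } 1 \\ i = 1, 2, \ldots, k}} (m-1; x_1 - \delta_1, x_2 - \delta_2, \ldots, x_k - \delta_k). \]

We can assume that $\sum_{i=1}^{k} (x_i - \delta_i) \le m - 1$, since the term containing $\sum_{i=1}^{k} (x_i - \delta_i) = m$ on the right hand side is by definition zero. Then, by the induction assumption,

\[ (m; x_1, x_2, \ldots, x_k) = \sum_{\substack{\delta_i = 0 \text{ or } 1 \\ i = 1, 2, \ldots, k}} \prod_{i=1}^{k} \binom{m}{x_i - \delta_i} \left[ 1 - \frac{\sum_{i=1}^{k} (x_i - \delta_i)}{m} \right] \]

\[ = \binom{m}{x_1 - 1} \sum_{\substack{\delta_i = 0 \text{ or } 1 \\ i = 2, \ldots, k}} \prod_{i=2}^{k} \binom{m}{x_i - \delta_i} \left[ 1 - \frac{x_1}{m} + \frac{1}{m} - \frac{\sum_{i=2}^{k} (x_i - \delta_i)}{m} \right] + \binom{m}{x_1} \sum_{\substack{\delta_i = 0 \text{ or } 1 \\ i = 2, \ldots, k}} \prod_{i=2}^{k} \binom{m}{x_i - \delta_i} \left[ 1 - \frac{x_1}{m} - \frac{\sum_{i=2}^{k} (x_i - \delta_i)}{m} \right]. \]

The two sums on the right consist of $2^{k-1}$ terms and correspond to putting $\delta_1 = 1$ and $\delta_1 = 0$ in the first expansion. Then, using $\binom{m}{x_1 - 1} = \frac{x_1}{m+1} \binom{m+1}{x_1}$ and $\binom{m}{x_1} = \frac{m+1-x_1}{m+1} \binom{m+1}{x_1}$,

\[ (m; x_1, x_2, \ldots, x_k) = \binom{m+1}{x_1} \sum_{\substack{\delta_i = 0 \text{ or } 1 \\ i = 2, \ldots, k}} \prod_{i=2}^{k} \binom{m}{x_i - \delta_i} \left[ 1 - \frac{x_1}{m} - \frac{\sum_{i=2}^{k} (x_i - \delta_i)}{m} + \frac{x_1}{m(m+1)} \right] \]

\[ = \binom{m+1}{x_1} \sum_{\substack{\delta_i = 0 \text{ or } 1 \\ i = 2, \ldots, k}} \prod_{i=2}^{k} \binom{m}{x_i - \delta_i} \left[ 1 - \frac{x_1}{m+1} - \frac{\sum_{i=2}^{k} (x_i - \delta_i)}{m} \right]. \]

Repeating the same argument for $i = 2, 3, \ldots, k$ yields the theorem.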
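The agreement between the recursive definition and the closed form of Theorem 2.5.1 can be checked for small arguments. The sketch below is an illustration, not part of the original argument: the function names are hypothetical, and the symbol is taken to be zero whenever some $x_i - \delta_i$ is negative (an assumption consistent with $\binom{n+1}{x} = 0$ for $x < 0$).

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product
from math import comb


@lru_cache(maxsize=None)
def bracket_recursive(n, xs):
    """(n; x_1, ..., x_k) from the recursive definition; xs is a tuple."""
    if min(xs) < 0 or sum(xs) > n:
        return 0  # negative arguments are taken to give 0 (assumption)
    if n == 0:
        return 1  # only (0; 0, ..., 0) can reach this line
    return sum(
        bracket_recursive(n - 1, tuple(x - d for x, d in zip(xs, deltas)))
        for deltas in product((0, 1), repeat=len(xs))  # the 2^k choices of delta
    )


def bracket_closed(n, xs):
    """Closed form: prod_i C(n+1, x_i) * (1 - sum(x_i) / (n+1))."""
    if min(xs) < 0 or sum(xs) > n:
        return 0
    value = Fraction(n + 1 - sum(xs), n + 1)
    for x in xs:
        value *= comb(n + 1, x)
    return value
```

For example, both functions give 6 for $(3; 2, 1)$. Exact rational arithmetic is used in the closed form so that no rounding enters the comparison.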


Proceeding in a similar way to Theorem 2.5.1, we can easily show


We can combine this procedure in the following fashion. Let $I$ be a subset $(i_1, i_2, \ldots, i_r)$ of the integers $1, 2, \ldots, k$ where $1 \le i_1 < i_2 < \cdots < i_r \le k$. Let $\bar{I}$ denote the complementary set. For $\sum x_i \le n$, we denote the sum

\[ \sum_{\delta_{i_1} = 0}^{1} \cdots \sum_{\delta_{i_r} = 0}^{1} (n; x_1, \ldots, x_{i_1} - \delta_{i_1}, \ldots, x_{i_r} - \delta_{i_r}, x_{i_r + 1}, \ldots, x_k) \]

by $(n; x_1, x_2, \ldots, x^*_{i_1}, \ldots, x^*_{i_r}, x_{i_r + 1}, \ldots, x_k)$. By an argument similar to Theorem 2.5.1, we can establish the following theorem.


Theorem 2.5.2 For $\sum x_i \le n$,

$(n; x_1, x_2, \ldots, x^*_{i_1}, \ldots, x^*_{i_r}, \ldots, x_k)$

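From the definition of these numbers, it is obvious that

\[ (n; x_1, \ldots, x^*_{i_1}, \ldots, x^*_{i_{r-1}}, \ldots, x^*_{i_r}, \ldots, x_k) = \sum_{\delta_{i_r} = 0}^{1} (n; x_1, \ldots, x^*_{i_1}, \ldots, x^*_{i_{r-1}}, \ldots, x_{i_r} - \delta_{i_r}, \ldots, x_k). \]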

We  can  associate  these  numbers  with  the  number  of  vectors  with  non-negative 
integral  components  satisfying  certain  conditions.  As  a  special  case,  we 
can  obtain  the  number  of  regular  simple  sampling  plans. 


2.6 Non-Negative Increasing Vectors.

Let $a_1, a_2, \ldots$ denote non-negative integers. For $n \ge 1$, let us consider the set of vectors $A_n = \{a : a = (a_1, a_2, \ldots, a_n)\}$, where the $a_i$ satisfy the following conditions (2.6.1):

(1) $0 \le a_1 \le a_2 \le \cdots \le a_n$.

(2) $a_i \le ki$, $i = 1, 2, \ldots, n$.

(3) Let $j$ be the first index $i$, $i = 1, 2, \ldots, n$, such that $a_i > 0$. Then $0 < a_j < a_{j+1} < \cdots < a_n$.
Condition (3) ensures that the non-zero components of the vector $a$ are strictly increasing. We note that if we put k = 2 in 2.6.1, we obtain the vectors defined by 2.4.1, which can be shown to be in a 1-1 correspondence with the set of regular sampling plans.

Let us now consider the subset $S(n; x_1, x_2, \ldots, x_k)$ of $A_n$, where $S(n; x_1, x_2, \ldots, x_k)$ consists of all those vectors in $A_n$ that have exactly $x_i$ of their positive components $\equiv i \pmod{k}$, $i = 1, 2, \ldots, k$, respectively. The following theorem establishes the connection between the number of elements in $S(n; x_1, x_2, \ldots, x_k)$ and the numbers $(n; x_1, x_2, \ldots, x_k)$ of Theorem 2.5.1.

Theorem 2.6.1 The number of vectors in $S(n; x_1, x_2, \ldots, x_k)$ is given by the number $(n; x_1, x_2, \ldots, x_k)$ of Theorem 2.5.1.

Proof: The theorem is proved by establishing a 1-1 correspondence between the set $S(n; x_1, x_2, \ldots, x_k)$ and the set

\[ T = \bigcup_{\substack{\delta_i = 0 \text{ or } 1 \\ i = 1, 2, \ldots, k}} S(n-1; x_1 - \delta_1, x_2 - \delta_2, \ldots, x_k - \delta_k). \]

The proof is accomplished by means of the following 1-1 mapping $P$ of $S$ onto $T$. Let $a \in S(n; x_1, x_2, \ldots, x_k)$. Consider the vector $P(a)$ obtained as follows:

(1) replace every element $\le k$ of $a$ by zero.

(2) replace every element $a_i > k$ by $a_i - k$.

(3) suppress the first zero element of $a$, leaving a vector of $n-1$ components.

It was shown in [18] that this mapping is a 1-1 mapping of $S(n; x_1, x_2, \ldots, x_k)$ onto the set

\[ \bigcup_{y_1 = 0}^{x_1} \cdots \bigcup_{y_k = 0}^{x_k} S(n-1; y_1, y_2, \ldots, y_k), \]

where the vectors satisfied only the conditions (1) and (2) of 2.6.1. Since this mapping preserves inequalities, it is readily established that this mapping $P$ is also a 1-1 mapping of $S(n; x_1, x_2, \ldots, x_k)$ onto $T$.

We can extend the results of Theorem 2.6.1 in the following manner.

Definition 2.6.2 Let us consider the set of vectors $B = \{b : b = (b_1, b_2, \ldots, b_n)\}$,
where the components $b_i$ satisfy the conditions (1) and (3) of 2.6.1 and
condition (2) replaced by

(2') $b_i \le ki + p$, $p$ an integer such that $0 \le p \le k-1$.

Let $S_p(n; x_1, x_2, \ldots, x_k)$ denote the subset of $B$ with the usual congruence
properties. Similarly to Theorem 2.6.1, we can prove the following theorem.


Theorem 2.6.2 The number of vectors in $S_p(n; x_1, x_2, \ldots, x_k)$ is given by
$(n; x_{p+1}, x_{p+2}, \ldots, x_k, x_1^*, x_2^*, \ldots, x_p^*)$, where
$(n; x_{p+1}, x_{p+2}, \ldots, x_p^*)$ is the number of Theorem 2.5.2.


2.7 Regular Simple Sampling Plans.

We  know  that  putting  k  =  2  in  2.6.1  yields  the  vectors  that 
correspond  to  regular  sampling  plans.  Thus,  with  the  aid  of  Theorems  2.5.1 
and  2.6.1,  we  can  enumerate  the  regular  sampling  plans. 


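Let us define, in general, $R^{(k)}$ to be the sum of the numbers
$(n; x_1, x_2, \ldots, x_k)$ of Theorem 2.5.1 over all $x_1, x_2, \ldots, x_k$.

Consider the expansion

$\left[(1+x)^{n+1}\right]^k = \left[\binom{n+1}{0} x^0 + \binom{n+1}{1} x^1 + \ldots + \binom{n+1}{n+1} x^{n+1}\right]^k$.

It is readily verified that the coefficient of $x^c$ in this expansion is

$\sum_{x_1 + x_2 + \ldots + x_k = c} \binom{n+1}{x_1} \binom{n+1}{x_2} \cdots \binom{n+1}{x_k}$.

Since $\left[(1+x)^{n+1}\right]^k = (1+x)^{k(n+1)}$, then

$\sum_{x_1 + x_2 + \ldots + x_k = c} \binom{n+1}{x_1} \binom{n+1}{x_2} \cdots \binom{n+1}{x_k} = \binom{k(n+1)}{c}$.

Thus, $R^{(k)} = \sum_{c=0}^{n+1} \left(1 - \frac{c}{n+1}\right) \binom{k(n+1)}{c}$. For $k = 2$,

$R^{(2)} = \sum_{c=0}^{n+1} \left(1 - \frac{c}{n+1}\right) \binom{2(n+1)}{c}$.

Thus, the number of regular simple sampling plans of size $n + 1$ is $\binom{2n+1}{n}$.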
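The two facts used in this computation can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names are hypothetical, and the closed form $\binom{2n+1}{n}$ is our reading of the partly illegible display above. It verifies the coefficient identity for $k = 2$ and evaluates the weighted sum for small $n$ with exact arithmetic.

```python
from fractions import Fraction
from math import comb


def convolution_coefficient(n: int, c: int) -> int:
    """Coefficient of x^c in [(1 + x)^(n+1)]^2, computed term by term."""
    return sum(comb(n + 1, x1) * comb(n + 1, c - x1) for x1 in range(c + 1))


def weighted_sum(n: int, k: int = 2) -> Fraction:
    """Sum over c = 0..n+1 of (1 - c/(n+1)) * C(k(n+1), c), kept exact."""
    return sum(
        Fraction(n + 1 - c, n + 1) * comb(k * (n + 1), c) for c in range(n + 2)
    )


if __name__ == "__main__":
    for n in range(6):
        # The convolution of binomial coefficients collapses to one coefficient.
        assert all(
            convolution_coefficient(n, c) == comb(2 * (n + 1), c)
            for c in range(2 * n + 3)
        )
        print(n, weighted_sum(n), comb(2 * n + 1, n))
```

For each small $n$ tried, the two printed values agree, which supports the reading of the closed form.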


For $k = 2$, we can obtain the same results by considering the
set of vectors $B_0 = \{b : b = (b_1, \ldots, b_n)\}$ as defined in 2.6.2 with $p = 0$
and the set of vectors $B_1$ as defined in 2.6.2 with $p = 1$. Let us
partition the sets $B_0$ and $B_1$ into the sets $T_0(n;i)$, $T_1(n;i)$ respectively,
$i = 0, 1, 2, \ldots, n$. Each set $T_j(n;i)$, $j = 0, 1$, consists of those vectors
in $B_0$ and $B_1$ respectively that have exactly $i$ positive components.

Let $N_0$, $N_1$ represent the number of vectors in $B_0$, $B_1$ respectively and
$N_0(n;i)$, $N_1(n;i)$ the number of vectors in $T_0(n;i)$ and $T_1(n;i)$ respectively.

It can be shown in a manner similar to Theorems 2.5.1 and 2.6.1 that

$N_0(n;i) = \binom{2n}{i} - \binom{2n}{i-2}$

and

$N_1(n;i) = \binom{2n+1}{i} - \binom{2n+1}{i-2}$.

Thus, we can again enumerate the regular sampling plans, since the total
number of plans is

$N_0 = \sum_{i=0}^{n} N_0(n;i) = \sum_{i=0}^{n} \left[\binom{2n}{i} - \binom{2n}{i-2}\right]$.
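These counts can be checked by brute force for small $n$. The sketch below is illustrative only. It assumes condition (2') reads $b_i \le ki + p$ and that the closed forms are $\binom{2n+p}{i} - \binom{2n+p}{i-2}$, with a binomial coefficient whose lower index is negative taken as zero. Both are our readings of a partly illegible original, and all function names are hypothetical.

```python
from itertools import product
from math import comb


def vectors(n: int, k: int, p: int):
    """Yield every b = (b_1, ..., b_n) with b_i <= k*i + p, b nondecreasing,
    and the positive components strictly increasing."""
    ranges = [range(k * i + p + 1) for i in range(1, n + 1)]
    for b in product(*ranges):
        # Adjacent pair is allowed if both are zero or strictly increasing.
        if all(x < y or x == y == 0 for x, y in zip(b, b[1:])):
            yield b


def counts_by_positive(n: int, k: int, p: int) -> list:
    """counts[i] = number of vectors with exactly i positive components."""
    counts = [0] * (n + 1)
    for b in vectors(n, k, p):
        counts[sum(1 for x in b if x > 0)] += 1
    return counts


def closed_form(n: int, p: int, i: int) -> int:
    """Assumed closed form C(2n+p, i) - C(2n+p, i-2) for k = 2."""
    top = 2 * n + p
    return comb(top, i) - (comb(top, i - 2) if i >= 2 else 0)


if __name__ == "__main__":
    for n in (1, 2, 3):
        for p in (0, 1):
            print(n, p, counts_by_positive(n, 2, p),
                  [closed_form(n, p, i) for i in range(n + 1)])
```

For $p = 0$ the counts for a given $n$ sum to $\binom{2n+1}{n}$, in agreement with the enumeration obtained from the binomial expansion.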


It is also readily verified that

$\sum_{i=0}^{n} \left[\binom{2n}{i} - \binom{2n}{i-2}\right] = \binom{2n}{n} + \binom{2n}{n-1} = \binom{2n+1}{n}$.

CHAPTER III


SIMPLICITY IN OPTIMAL SAMPLING PLANS

3.1 Introduction

Girshick, Mosteller and Savage in their fundamental paper, [9],
first characterized binomial sampling plans in terms of boundary, continuation
and inaccessible points in the plane. In Chapter I, section 1.5,
various  characterizations  of  simple  binomial  sampling  plans  have  been 
described.  In  [4],  Blackwell  and  Girshick  gave  a  description  of  a  more 
general  sampling  plan  in  terms  of  cylinder  sets.  In  1.7,  we  described  their 
procedure  for  obtaining  optimal  sampling  plans  in  terms  of  these  cylinder 
sets.  In  general,  these  optimal  sampling  plans  in  terms  of  cylinder  sets  of 
Blackwell  and  Girshick  need  not  necessarily  yield  the  sampling  plans  that 
can  be  characterized  by  boundary,  continuation  and  inaccessible  points 
in  the  plane.  However,  in  the  cases  where  the  observations  are  independent, 
identically  distributed  Bernoulli  random  variables,  the  optimal  sampling 
plans  of  Blackwell  and  Girshick  do  yield  the  binomial  sampling  plans  as 
characterized  in  [9].  It  is  not  at  all  clear,  however,  when  the  optimal 
sampling  plans  described  by  cylinder  sets  are  simple  sampling  plans  in  the 
sense  considered  in  this  thesis.  It  is  the  purpose  of  this  chapter  to 
present  a  method  by  which  one  can  determine  whether  optimal  and  other  more 
general  sampling  plans  described  by  the  cylinder  sets  of  [4]  are  simple 
sampling  plans  in  the  sense  considered  in  this  thesis. 

Section 3.2 deals with several examples of simple and non-simple
optimal plans. In the next section, we present an algorithm by which one
can readily determine whether the optimal sampling plans described by the
cylinder sets of Blackwell and Girshick are simple. In the last section,
we study the risk functions $E_j(x)$ and $U_j(x)$ in particular problems in
order to provide some insight into the structure of simple optimal plans.

While  it  appears  possible  to  formulate  certain  conditions  which  would  assure 
simplicity  in  rather  particular  cases,  we  are  not  able  to  give  any  conditions 
that  can  be  applied  in  general  cases  of  greater  interest. 

3.2  Examples  of  Simple  and  Non-Simple  Optimal  Plans. 

In this section, we shall give examples of situations that lead to
optimal simple sampling plans and an example that leads to an optimal,
non-simple sampling plan. The fact that we can obtain a non-simple optimal plan
indicates the complete generality of the formulation of these plans as given
by Blackwell and Girshick [4]. To determine these optimal sampling plans in
particular situations is a formidable, though straightforward, computational
problem, as will be evidenced by the examples of this and later sections.


Example  3.2.1  Let  N  =  3,  Cl  =  (£,  %)  A  -  (£,  |).  Let  £(£)  - 
£,(i)  -  i  be  the  ^priori  distribution.  Let  the  loss  function  be  L(-±-,  -L) 
L(i,  J)  =  50,  L(J,  £)  =  100  and  L(-J,  J)  =  0.  Let  C  (x)  be  given  by 


0, 


C  (x)  =  0  C.^(x)  =  1  if  x^ 


=  5  if  x  =  1 


C  (x)  =3  if  E  x.  =  0  or  3 
5  i=l  1 


C. 


0  C  (x)  =  7  if  Ex, 


0  or  2 


•  *a  J 

i=l 

2 

3  if  Ex, 

i=l 


=  10  if  E  x, 

•  1  ] 

1=1 


1  or  2 


for  all  x  e  X,  when  X  =  {x  :  x  =  (x, ,x  ,x 

i  a  3 


x. 


0  or  1  for  i  =  1,2,3} 


We denote the value of the conditional expectation $E_j(\alpha_{j+1})$ at $m_j$ by
$E_j(m_j)$, cf. 1.7.6. Let $h_j(m_j) = U_j(m_j) - E_j(m_j)$. For this example we
exhibit the values of $h_j(m_j)$: $h_2(0) = 1.9$, $h_2(1) = -.75$, $h_2(2) = .9$,
$h_1(0) = 5.2$, $h_1(1) = .7$, $h_0(0) = 6.2$. Since $h_2(0) > 0$, $h_2(1) < 0$,
$h_2(2) > 0$ and $h_1(0)$, $h_1(1)$ and $h_0(0)$ are all positive, we can
actually determine whether this plan is simple by examining the functions
$h_j(m_j)$. In the cases of small $N$, it is equally convenient to determine
the sampling plan specifically in terms of boundary, continuation and
inaccessible points in the plane (as illustrated in Figure 4 below).


[Figure 4. Sampling plan for Example 3.2.1]

Notice that the cost function in this example is rather unusual in that it
depends on $x$ as well as $j$. Pathological though this example may seem, it is
very possible that any general theory will not completely rule out such cases.
However, we shall demonstrate explicitly that Blackwell and Girshick [4] have
proved the simplicity of optimal sequential sampling plans in the general case of
the truncated dichotomy (with a linear cost function and a similar loss
function). In most practical cases, the cost function usually satisfies
$C_{n+1}(x) \ge C_n(x)$, $n = 0, 1, \ldots$ and it seems reasonable to conjecture that
the optimal plans are simple in this case.


Example 3.2.2 We now give an example of the more practical kind. Let $N = 5$,

$\Omega = \{w : 0 \le w \le 1\}$, $A = \{a : 0 \le a \le 1\}$,

$\xi(w) = 1$ for $0 \le w \le 1$, $\xi(w) = 0$ otherwise,

$L(w, a) = k(w - a)^2$, $k > 0$, and $C_j(x) = j$ for all $x \in X$.

For this example, the functions $h_j(m_j) = U_j(m_j) - E_j(m_j)$ are symmetric and
it can be seen directly from these functions that the optimal plans are simple.
We have also determined the actual plans and list them in the following table
in terms of the vector notation of 1.5.

Table I

k                          Sampling Plan

k > 1764/5                 (0, 0, 0, 0, 0, 0)
1764/8 < k ≤ 1764/5        (1, 0, 0, 0, 0, 1)
1764/9 < k ≤ 1764/8        (2, 1, 0, 0, 1, 2)
150 < k ≤ 1764/9           (1, 0, 0, 0, 1)
400/3 < k ≤ 150            (0, 0, 0, 0)
100 < k ≤ 400/3            (1, 0, 0, 1)
72 < k ≤ 100               (0, 0, 0)
36 < k ≤ 72                (0, 0)
k ≤ 36                     No observations required.

We remark that this example is a generalization of an exercise in Blackwell
and Girshick [4] p. 245, where $k = 200$. This particular example, which would
appear under the third entry for $k$ in Table I, is discussed in detail at the
end of this chapter, where we hope to provide more information about how the
nature of $h_j(m_j)$ determines the simplicity of the plan.
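The functions $h_j(m_j)$ of Example 3.2.2 can be computed by backward induction from stage $N$. The following is a minimal sketch under stated assumptions rather than the computation used for Table I: it takes the uniform a priori distribution, so that the posterior after $m$ ones in $j$ trials is Beta$(m+1, j-m+1)$, takes the Bayes risk of stopping to be $k$ times the posterior variance plus the cost $j$, and uses hypothetical function names throughout.

```python
from fractions import Fraction


def stopping_risk(k, j: int, m: int) -> Fraction:
    """U_j(m): cost j plus k times the variance of the Beta(m+1, j-m+1) posterior."""
    variance = Fraction((m + 1) * (j - m + 1), (j + 2) ** 2 * (j + 3))
    return j + Fraction(k) * variance


def h_values(k, N: int) -> dict:
    """Return {(j, m): h_j(m)} for j = 0..N-1, working backwards from stage N."""
    rho = [stopping_risk(k, N, m) for m in range(N + 1)]  # sampling must stop at N
    h = {}
    for j in range(N - 1, -1, -1):
        next_rho = []
        for m in range(j + 1):
            p_one = Fraction(m + 1, j + 2)  # predictive probability that x_{j+1} = 1
            continue_risk = (1 - p_one) * rho[m] + p_one * rho[m + 1]
            stop_risk = stopping_risk(k, j, m)
            h[(j, m)] = stop_risk - continue_risk
            next_rho.append(min(stop_risk, continue_risk))
        rho = next_rho
    return h


def continuation_points(k, N: int) -> dict:
    """For each stage j, the values of m with h_j(m) > 0 (keep sampling)."""
    h = h_values(k, N)
    return {j: [m for m in range(j + 1) if h[(j, m)] > 0] for j in range(N)}


if __name__ == "__main__":
    print(continuation_points(200, 5))
```

For $k = 200$ and $N = 5$ the values of $m$ with $h_j(m) > 0$ form a single unbroken block at every stage, which is the pattern one expects of a simple plan. Exact rational arithmetic is used so that threshold cases such as $k = 36$ are not blurred by rounding.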


3.3 A Characterization of Sampling Plans in Terms of Basic Cylinder Sets.

Blackwell and Girshick [4], Chapter 3, have defined a sequential
sampling plan as a partition $S = (S_0, S_1, \ldots, S_N)$ of the sample space $X$,
such that each $S_j$ is a cylinder set over $K = [r \in J;\ 0 < r \le j]$,
$J = \{0, 1, 2, \ldots, N\}$. For the sample space $X = [x : x = (x_1, x_2, \ldots, x_N),\ x_i = 0$
or $1,\ i = 1, \ldots, N]$ and independent, identically distributed observations
$x_i$, it is difficult to determine whether the optimal sampling plans are
simple or not, as seen by the examples in 3.2. In this section, we shall
consider a sampling plan as a partition of the sample space by means of the
basic cylinder sets, which have already been defined by Blackwell and Girshick [4],
p. 239, and we shall give a method by which one can determine whether an optimal
plan is simple or not. All of our definitions apply only to the sample space
$X$ given above.


Definition 3.3.1 For each $x \in X$, let $F_j(x)$ be the set of all points
$y \in X$ such that $y \in F_j(x)$ if, and only if, $y_i = x_i$ for $i = 1, 2, \ldots, j$.
The sets $F_j(x)$ are called basic cylinder sets. In other words, given each
$x \in X$, the sets $F_j(x)$ contain all those points in $X$ that have the same
first $j$ coordinates as $x$. Since for each $j$, there are $2^j$ possible
different first $j$ coordinates, there are $2^j$ different sets $F_j(x)$.

Definition 3.3.2 For a fixed $j$, let $B^j$ be the collection of all sets $F_j(x)$.

Since there are $2^j$ possible sets in $B^j$, we can label these sets
$B_1^j, B_2^j, \ldots, B_{2^j}^j$. Let $I_j = (i_1, i_2, \ldots, i_r)$, $1 \le i_1 < i_2 < \ldots < i_r \le 2^j$, be
a subset of the set of integers $(1, 2, \ldots, 2^j)$. As we have already seen,
Blackwell and Girshick define a sequential sampling plan as a partition
$S = (S_0, S_1, \ldots, S_N)$ of the sample space $X$, where each $S_j$ is a cylinder
set over the set $K = (r \in J : 0 < r \le j)$. Since each $S_j = \bigcup_{i \in I_j} B_i^j$,
it is readily seen that we can express a sequential sampling plan equivalently
in terms of the basic cylinder sets of definition 3.3.1. Thus, we find it
convenient to consider a sequential sampling plan as a partition of the sample
space $X$ into the basic cylinder sets of definition 3.3.1.

There are $(2^j)!$ ways of labelling the sets in $B^j$. We shall choose
a particular labelling in order to readily identify the particular sets $B_k^j$.
Let us label the sets in $B^j$, $j = 0, 1, 2, \ldots, N$ in the following manner.


Definition 3.3.3 $B^0$ consists of one set, which we label $B_1^0$. $B^1$ consists
of 2 sets

$B_1^1 = \{x \in X : x_1 = 0\}$

$B_2^1 = \{x \in X : x_1 = 1\}$

We label the sets $B_k^j$ inductively. Suppose the sets of $B^j$ have already been
labelled, $j < N$. Clearly

$B_k^j = \{x \in X : x \in B_k^j \text{ and } x_{j+1} = 0\} \cup \{x \in X : x \in B_k^j \text{ and } x_{j+1} = 1\}$

i.e. we partition each $B_k^j \in B^j$ into 2 sets, yielding a total of $2^{j+1}$
sets. We label every set with $x_{j+1} = 0$ as $B_r^{j+1}$, where $r$ is to be
determined, and similarly every set with $x_{j+1} = 1$ as $B_s^{j+1}$, where $s$
is to be determined. To determine $r$, $s$ for a particular $B_k^j$, we
proceed as follows:

If $k = 1$, $r = 1$ and $s = 2$. If $k > 1$, $k$ must lie in one of
the intervals

$\left(\sum_{u=0}^{i-1} \binom{j}{u},\ \sum_{u=0}^{i} \binom{j}{u}\right]$, where $i = 1, 2, \ldots, j$.

Then, for $k \in \left(\sum_{u=0}^{i-1} \binom{j}{u},\ \sum_{u=0}^{i} \binom{j}{u}\right]$,

$r = k + \sum_{u=0}^{i-1} \binom{j}{u}$ and $s = k + \sum_{u=0}^{i} \binom{j}{u}$.

With this labelling the sets $B_k^j$ can be divided into consecutive
groups, each group consisting of $\binom{j}{i}$ sets, $i = 0, 1, 2, \ldots, j$, and such
that if $x \in B_k^j$ for any $k \in \left(\sum_{u=0}^{i-1} \binom{j}{u},\ \sum_{u=0}^{i} \binom{j}{u}\right]$,
then $\sum_{v=1}^{j} x_v = i$.
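The inductive rule for $r$ and $s$ can be sketched in a few lines. This is an illustration only: the function names are hypothetical, each set $B_k^j$ is represented by the string of its first $j$ coordinates, and the case $k = 1$ is treated as the interval with $i = 0$, which gives $r = 1$ and $s = 2$ as in the definition.

```python
from math import comb


def child_labels(j: int, k: int) -> tuple:
    """Given the label k of a set in B^j, return (r, s): the labels in B^(j+1)
    of the subsets with x_{j+1} = 0 and x_{j+1} = 1 respectively."""
    lower = 0  # sum of C(j, u) for u < i
    for i in range(j + 1):
        upper = lower + comb(j, i)  # sum of C(j, u) for u <= i
        if lower < k <= upper:
            return k + lower, k + upper
        lower = upper
    raise ValueError("k must satisfy 1 <= k <= 2**j")


def labelled_sets(j: int) -> list:
    """First-j-coordinate strings of B_1^j, B_2^j, ..., in label order."""
    level = {1: ""}  # B_1^0 is the whole sample space
    for stage in range(j):
        next_level = {}
        for k, coords in level.items():
            r, s = child_labels(stage, k)
            next_level[r] = coords + "0"
            next_level[s] = coords + "1"
        level = next_level
    return [level[k] for k in sorted(level)]


if __name__ == "__main__":
    for j in range(4):
        print(j, labelled_sets(j))
```

Running the sketch for $j = 3$ lists the coordinate strings in the order 000, 001, 010, 100, 011, 101, 110, 111, so that strings with the same number of ones are adjacent.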

Despite the apparent notational complexity, we illustrate the simple
idea behind this labelling with the following example. We indicate the
first $j$ coordinates of $x$ underneath each set.

j = 0:   B_1^0

j = 1:   B_1^1   B_2^1
         0       1

j = 2:   B_1^2   B_2^2   B_3^2   B_4^2
         00      01      10      11

j = 3:   B_1^3   B_2^3   B_3^3   B_4^3   B_5^3   B_6^3   B_7^3   B_8^3
         000     001     010     100     011     101     110     111

j = 4:   B_1^4   B_2^4   B_3^4   B_4^4   B_5^4   B_6^4   B_7^4   B_8^4
         0000    0001    0010    0100    1000    0011    0101    1001

         B_9^4   B_10^4  B_11^4  B_12^4  B_13^4  B_14^4  B_15^4  B_16^4
         0110    1010    1100    0111    1011    1101    1110    1111

From this example, we see that the sets are labelled so that the
sequences with $\sum_{i=1}^{j} x_i = m_j = 0$ appear first, then those with $m_j = 1$ and
so forth. For example, in order to label the sets $B_3^4, B_4^4, \ldots, B_8^4$, we
first took the 3 sequences corresponding to $B_2^3, B_3^3, B_4^3$, attached a
zero to each sequence and labelled them $B_3^4, B_4^4, B_5^4$; then we attached
a one to each sequence and labelled these sets as $B_6^4, B_7^4, B_8^4$. Thus, in $B^5$,

$B_1^5 = \{x \in X : x_1 = x_2 = x_3 = x_4 = x_5 = 0\}$

$B_2^5 = \{x \in X : x_1 = x_2 = x_3 = x_4 = 0,\ x_5 = 1\}$

$B_3^5 = \{x \in X : x_1 = x_2 = x_3 = 0,\ x_4 = 1,\ x_5 = 0\}$ and so forth.

In general, the sequences with the fewest number of ones are labelled
first; sequences with the same number of ones are adjacent and can be shown
to follow a "dictionary order", although we shall not establish the details.

Definition 3.3.4 For each $N$, we form the triangular array $B_N$ in the following
manner. $B_N$ consists of $N + 1$ rows, the rows being numbered, for the sake
of convenience, from 0 to $N$. The $j$th row, $j = 0, 1, 2, \ldots, N$, consists
of $2^j$ terms, where the $k$th term in the $j$th row is the $B_k^j$ of
definition 3.3.3. In other words, the array $B_N$ consists of all the basic
cylinder sets for the sample space $X$ as ordered by definition 3.3.3.

Given any sampling plan $S$ we can represent it as an array $B_N^S$ defined
below.


Definition 3.3.5 Let the sampling plan $S$ be given. Let $B_N^S$ be the
triangular array $\{B_{jk}\}$, $j = 0, 1, 2, \ldots, N$, $k = 1, 2, \ldots, 2^j$, formed from the
triangular array $B_N$ as follows:

(1) retain all the $B_k^j$ given by the sampling plan $S$, i.e.
$B_{jk} = B_k^j$ for $B_k^j \in S$.

(2) for each $B_k^j \in S$ retain all the sets $B_t^r$ such that
$B_t^r \subset B_k^j$, $r = j+1, j+2, \ldots, N$.

(3) replace the remaining $B_k^j$ by zero.

We have, thus, represented a sampling plan $S$ as an array $B_N^S$
consisting of $N + 1$ rows, with $2^j$ terms in the $j$th row. Each term in
the array is either a zero or an entry $B_k^j$. With the usual definitions of
equality of sampling plans and arrays, it is readily verified that two
different sampling plans result in two different arrays $B_N^S$. Not every
array of zeroes and sets $B_k^j$ will represent a sampling plan. Obviously,
one necessary but not sufficient condition is that with every $B_k^j$ retained
we also retain the sets $B_t^r$, $r = j+1, j+2, \ldots, N$, such that $B_t^r \subset B_k^j$.

As an example of this procedure consider the sampling plan of
size 5, $S = (B_4^2, B_5^3, B_6^3, B_6^4, B_7^4, B_8^4, B_1^5, B_2^5, B_3^5, B_4^5, B_5^5, B_6^5, B_7^5, B_8^5, B_9^5, B_{10}^5)$.
The array $B_5^S$ is

0
0  0
0  0  0  \underline{B_4^2}
0  0  0  0  \underline{B_5^3}  \underline{B_6^3}  B_7^3  B_8^3                                  3.3.1
0  0  0  0  0  \underline{B_6^4}  \underline{B_7^4}  \underline{B_8^4}  B_9^4  B_{10}^4  \ldots  B_{16}^4
\underline{B_1^5}  \underline{B_2^5}  \ldots  \underline{B_{10}^5}  B_{11}^5  B_{12}^5  \ldots  B_{32}^5

The underlined entries are given by rule (1) of definition 3.3.5 and
the sets not underlined are obtained from $B_k^j \in S$ according to rule (2) of
3.3.5.


A sampling plan $S$, given in terms of the basic cylinder sets above,
need not represent a plan of the type that can be described by the boundary,
continuation and inaccessible points of Girshick, Mosteller and Savage. To
avoid tedious writing, we shall refer to a sampling plan $S$ given in terms
of basic cylinder sets as a BG plan and a sampling plan that can be
described in terms of boundary, continuation and inaccessible points as a
GSM plan. Every GSM plan can be expressed as a BG plan, but not every
BG plan is a GSM plan. A relatively simple solution to the problem of
determining when a BG plan is a GSM plan can be given as follows.


Definition 3.3.6 Let $S$ be a BG plan. Let the array $B_N^S$ be given. For
each $j = 0, 1, 2, \ldots, N$, $m_j = 0, 1, 2, \ldots, j$, let $C_j(m_j)$ be the sequence

$B_{j,r},\ B_{j,r+1},\ \ldots,\ B_{j,\,r + \binom{j}{m_j} - 1}$, where $r = \sum_{u=0}^{m_j - 1} \binom{j}{u} + 1$.

Definition 3.3.7 Let the sampling plan $S$ be given. Define the array
$B_N^0$ to be the triangular array obtained from $B_N$ by applying only rules (1)
and (3) of definition 3.3.5, i.e. $B_{jk}^0 = B_k^j$ if $B_k^j \in S$; $B_{jk}^0 = 0$ otherwise.

For example, the array $B_5^0$ corresponding to the plan $S$ given
by equation 3.3.1 is

0
0  0
0  0  0  B_4^2
0  0  0  0  B_5^3  B_6^3  0  0                                                3.3.2
0  0  0  0  0  B_6^4  B_7^4  B_8^4  0  0  0  0  0  0  0  0
B_1^5  B_2^5  \ldots  B_{10}^5  0  0  \ldots  0

Now consider the sequence $C_j(m_j)$, $(m_j \ne 0, j)$. One of the
following three possibilities may occur:

(1) $B_{jr} = 0$ for all $B_{jr} \in C_j(m_j)$

(2) $B_{jr} = B_r^j$ for all $B_{jr} \in C_j(m_j)$

(3) $B_{jr} = 0$ for at least one $B_{jr} \in C_j(m_j)$ and
$B_{js} = B_s^j$ for at least one $B_{js} \in C_j(m_j)$.

We remark that if $m_j = 0$ or $j$ case (3) cannot occur.


Let us consider a BG plan represented by the array $B_N^S$. By
making the usual identification with lattice paths in the plane, we see
that in case (1) every path to the point $(m_j, j - m_j)$ exists and continues
at least one more step. Thus, this point can be identified as a "continuation"
point, since in either the BG or GSM case we continue taking observations
(sampling). Similarly, in case (2) we can identify the point $(m_j, j - m_j)$
as a "boundary" or "inaccessible" point, since we either continue taking
observations until we reach this point and stop or else we have stopped
sampling before. In case (3), we notice the different situations that may
arise. Since there is at least one zero in this sequence, this implies
that there is at least one sequence of observations extending beyond this
point and thus this point is a "continuation" point. However, since
there is at least one non-zero element, it is possible that the sampling
plan $S$ insists that we take observations until we reach this point and
stop. Thus, the point $(m_j, j - m_j)$ could be both a "boundary" and
"continuation" point. In a GSM plan such an anomalous situation cannot
occur.


The following is an example of a BG plan that is not a GSM
plan. $S = (B_2^2, B_4^2, B_1^3, B_2^3, B_4^3, B_6^3)$. The array $B_3^S$ is

0
0  0                                                                          3.3.3
0  B_2^2  0  B_4^2
B_1^3  B_2^3  B_3^3  B_4^3  B_5^3  B_6^3  B_7^3  B_8^3

The array $B_3^0$ is

0
0  0                                                                          3.3.4
0  B_2^2  0  B_4^2
B_1^3  B_2^3  0  B_4^3  0  B_6^3  0  0

This sampling plan tells us to take 2 observations and stop if we observe
01 or 11, otherwise take another observation and stop. The following
theorem presents a criterion for determining when a BG plan is a GSM plan.

Theorem 3.3.1 Let the sampling plan $S$ be given. Let $B_N^S$ and $B_N^0$
be the arrays corresponding to $S$. Let $C_j^0(m_j)$ be the sequence in the
array $B_N^0$ corresponding to the sequence $C_j(m_j)$ in $B_N^S$. A BG plan is
a GSM plan if, and only if, for every sequence $C_j(m_j)$ that has at
least one zero the corresponding sequence $C_j^0(m_j)$ has all of its terms zero.

Proof: Suppose a BG plan is a GSM plan. From the discussion preceding,
it follows that every point $(m_j, j - m_j)$ corresponding to every sequence
$C_j(m_j)$ with at least one zero is a continuation point. In fact, any
non-zero terms in $C_j(m_j)$ only appear in $B_N^S$ by applying rule (2) of
definition 3.3.5. But this implies that all the terms in $C_j^0(m_j)$ are zero.

Conversely, suppose to every sequence $C_j(m_j)$ with at least one
zero there corresponds a sequence $C_j^0(m_j)$ that has all its terms zero.
Then, any non-zero terms appearing in $C_j(m_j)$ appear because of applying
rule (2) of definition 3.3.5. Thus, the point $(m_j, j - m_j)$ is a continuation
point and the plan is a GSM plan. Thus, the theorem is proved.

Comparing the arrays $B_5^S$ and $B_5^0$ given by equations 3.3.1 and
3.3.2, we see that to every sequence $C_j(m_j)$ in $B_5^S$ with at least one zero,
the corresponding sequence $C_j^0(m_j)$ has all its entries zero. A similar
comparison of equations 3.3.3 and 3.3.4 shows that $C_2^0(1)$ corresponding to
$C_2(1)$ does not have all its entries zero.

Theorem 3.3.1 provides a criterion for determining when a BG
plan is a GSM plan. Once we determine that a plan is a GSM plan,
we can then determine whether it is simple or not. From now on, we
assume that the sampling plan $S$ is a GSM plan, since we can always
determine this fact.

Definition 3.3.8 Let the array $B_N^S$ be given. From the array $B_N^S$ we
form the reduced array $B_N^*$ as follows: for each $j = 0, 1, 2, \ldots, N$ and
$m_j = 0, 1, 2, \ldots, j$ replace the sequence $C_j(m_j)$ by $(B_{m_j}^j)'$ if at least one
zero appears in the sequence and by $(B_{m_j}^j)^*$ otherwise. The reduced array
consists of $N + 1$ rows, the $j$th row consisting of $j + 1$ terms, each term being
distinguished by a prime or a star. From the discussion preceding Theorem
3.3.1, the following lemma follows.

Lemma 3.3.1 The point $(m_j, j - m_j)$ is a boundary point or inaccessible
point (continuation point) if, and only if, $(B_{m_j}^j)^*$ $[(B_{m_j}^j)']$ appears in the
reduced array.

As an example of this procedure consider the reduced array $B_5^*$
corresponding to the array $B_5^S$ given by equation 3.3.1

(B_0^0)'
(B_0^1)'  (B_1^1)'
(B_0^2)'  (B_1^2)'  (B_2^2)*
(B_0^3)'  (B_1^3)'  (B_2^3)*  (B_3^3)*
(B_0^4)'  (B_1^4)'  (B_2^4)*  (B_3^4)*  (B_4^4)*
(B_0^5)*  (B_1^5)*  (B_2^5)*  (B_3^5)*  (B_4^5)*  (B_5^5)*

This array represents the sampling plan illustrated in Figure 5. In terms
of the vector notation, this is the plan (0, 0, 0, 1, 2, 3).

In the reduced array, the primed sets appear in runs, in the
usual sense, in each row. Thus, we can use the reduced array to determine
whether a GSM sampling plan is simple.

[Figure 5. The Sampling Plan for 3.3.1]

Theorem 3.3.2 The GSM sampling plan is simple if, and only if, the
maximum number of runs of primed sets in each row is one.

Proof: $S$ is simple if, and only if, there are no boundary or inaccessible
points between the continuation points on the line $x + y = j$, $j = 1, 2, \ldots, N$.
Because of lemma 3.3.1, for each $j$ we can identify the non-continuation and
continuation points with the starred and primed sets respectively. Thus,
the above statement is equivalent to the theorem.


As an example of a non-simple GSM plan consider
$S = (B_2^2, B_3^2, B_2^3, B_8^3, B_1^4, B_2^4, B_{11}^4, B_{14}^4)$. $B_4^S$ is

0
0  0
0  B_2^2  B_3^2  0
0  B_2^3  B_3^3  B_4^3  B_5^3  B_6^3  0  B_8^3
B_1^4  B_2^4  \ldots  B_{16}^4

The reduced array is

(B_0^0)'
(B_0^1)'  (B_1^1)'
(B_0^2)'  (B_1^2)*  (B_2^2)'
(B_0^3)'  (B_1^3)*  (B_2^3)'  (B_3^3)*
(B_0^4)*  (B_1^4)*  (B_2^4)*  (B_3^4)*  (B_4^4)*

This sampling plan is not simple because in rows 2 and 3 there is more than
one run of primed sets. This plan is illustrated in Figure 6.
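The criteria of Theorems 3.3.1 and 3.3.2 lend themselves to a mechanical check. The sketch below is illustrative and rests on assumptions: a BG plan is represented by the set of coordinate strings of its basic cylinder sets, the three plans are our readings of the examples in this section, and the function names are hypothetical. An entry of the array for the plan is non-zero exactly when the string, or one of its prefixes, belongs to the plan.

```python
from itertools import product


def _entries(plan: set, N: int) -> dict:
    """Map (j, m) to a list of (nonzero_in_plan_array, given_directly_by_plan)
    over all coordinate strings of length j containing m ones."""
    table = {}
    for j in range(N + 1):
        for bits in product("01", repeat=j):
            w = "".join(bits)
            covered = any(w[:i] in plan for i in range(j + 1))
            table.setdefault((j, w.count("1")), []).append((covered, w in plan))
    return table


def is_gsm(plan: set, N: int) -> bool:
    """Theorem 3.3.1: no sequence C_j(m) may contain both a zero and a set
    given directly by the plan."""
    return not any(
        any(not covered for covered, _ in seq) and any(direct for _, direct in seq)
        for seq in _entries(plan, N).values()
    )


def is_simple(plan: set, N: int) -> bool:
    """Theorem 3.3.2: each row of the reduced array has at most one run of
    primed (continuation) entries. Intended for plans that are GSM plans."""
    table = _entries(plan, N)
    for j in range(N + 1):
        primed = [any(not c for c, _ in table[(j, m)]) for m in range(j + 1)]
        runs = sum(1 for m in range(j + 1) if primed[m] and (m == 0 or not primed[m - 1]))
        if runs > 1:
            return False
    return True


# Plans written as coordinate strings (our reading of the examples above).
PLAN_331 = {"11", "011", "101", "0011", "0101", "1001", "00000", "00001", "00010",
            "00100", "01000", "10000", "00011", "00101", "01001", "10001"}
PLAN_333 = {"01", "11", "000", "001", "100", "101"}
PLAN_NON_SIMPLE = {"01", "10", "001", "111", "0000", "0001", "1100", "1101"}

if __name__ == "__main__":
    print(is_gsm(PLAN_331, 5), is_simple(PLAN_331, 5))
    print(is_gsm(PLAN_333, 3))
    print(is_gsm(PLAN_NON_SIMPLE, 4), is_simple(PLAN_NON_SIMPLE, 4))
```

The check enumerates all $2^j$ strings in each row, so it is practical only for small $N$, which is the range of the examples considered here.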


[Figure 6]

We finally remark that we can determine whether a plan is a
GSM plan by comparing the array $B_N^S$ directly with the sampling plan
$S$, without explicitly introducing the array $B_N^0$. However, for sake of
exposition and clarity, we found it necessary to define the arrays $B_N^0$.
Let $S$ be a BG plan given in terms of the sets $B_k^j$. Let $B_N^S$ be the
array corresponding to $S$. We examine all the sequences $C_j(m_j)$ in $B_N^S$
that contain at least one zero. If, in any such sequence $C_j(m_j)$, any terms
$B_{jk}$ appear that are given directly by $S$, the sampling plan is not a GSM
plan. In a similar way, we can easily determine whether a GSM plan is simple
directly from the array $B_N^S$ without explicitly introducing the arrays $B_N^*$,
although we shall not go into the details here. Thus, given a sampling plan $S$
in terms of the cylinder sets of Blackwell and Girshick, we can determine
whether it is a plan that can be described by the boundary, continuation and
inaccessible points of Girshick, Mosteller and Savage and then decide whether
the plan is simple.


3.4 Optimal Sampling Plans and Simplicity.


From the discussion of optimal sampling plans in 1.7, we see that
an optimal plan is determined by the risk functions $E_j(m_j)$ and $U_j(m_j)$ for
each $j$ and $m_j$. Thus, it is reasonable to expect that the property of simplicity
of optimal sampling plans will, in some way, depend upon the behaviour of
these risk functions.

We now restate the condition for simplicity of optimal sampling
plans in terms of the risk functions $E_j$ and $U_j$.


Definition 3.4.1 For each $j = 0, 1, 2, \ldots, N-1$ we let $h_j(m_j) = U_j(m_j) - E_j(m_j)$.

Definition 3.4.2 Let $m_j'$ be the least $m_j$ for which $h_j(m_j) > 0$ and
$m_j''$ be the greatest $m_j$ for which $h_j(m_j) > 0$, $0 \le m_j \le j$. We remark
that for some $j$, $m_j'$ and $m_j''$ may be equal or $h_j(m_j) \le 0$ for all $m_j$,
so that $m_j'$ and $m_j''$ are undefined.

From lemmas 1.7.1 and 1.7.2, it follows that for a boundary or
inaccessible point, $h_j(m_j) \le 0$ and for a continuation point or inaccessible
point, $h_j(m_j) > 0$.

Thus, the condition for simplicity can be stated as follows.


Lemma 3.4.1 If for every $j = 0, 1, 2, \ldots, N-1$, $h_j(m_j) > 0$ for all $m_j$
such that $m_j' \le m_j \le m_j''$, then the optimal plan is simple.
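The condition of the lemma is easy to test once the values $h_j(0), h_j(1), \ldots, h_j(j)$ for a given $j$ are available. A minimal sketch follows, with a hypothetical function name and made-up rows of values used purely for illustration.

```python
def lemma_condition_holds(h_row) -> bool:
    """Check that h > 0 everywhere between the least and the greatest index
    at which h is positive. h_row[m] holds h_j(m) for m = 0, ..., j."""
    positive = [m for m, value in enumerate(h_row) if value > 0]
    if not positive:
        return True  # m' and m'' are undefined, so there is nothing to check
    m_least, m_greatest = positive[0], positive[-1]
    return all(h_row[m] > 0 for m in range(m_least, m_greatest + 1))


if __name__ == "__main__":
    print(lemma_condition_holds([-1.0, 2.0, 3.0, 0.5, -2.0]))  # one block of positives
    print(lemma_condition_holds([1.0, -0.5, 1.0]))             # positives split by a negative
```

The lemma requires the check to succeed for every $j = 0, 1, \ldots, N-1$. A row that rises to a single peak and then falls passes the check, as does any monotonic row.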


We mention 2 classes of functions that satisfy the conditions of
lemma 3.4.1:

Condition (1) $h_j(m_j)$ is a monotonic function of $m_j$.

Condition (2) $h_j(m_j)$ is monotonic increasing for $m_j \le x$, and
monotonic decreasing for $m_j > x$, for some $x$ such that $0 < x < j$.

Unfortunately, the determination of the functions $h_j(m_j)$ is a complicated
computational procedure. Also the determination of these functions is
essentially equivalent to determining the sampling plan. A more meaningful
question to ask would be what restrictions or conditions must we place on
the loss functions, cost functions, a priori distributions and so on, to
guarantee that the functions $h_j(m_j)$ will satisfy the conditions for
simplicity or, more directly, what classes of loss functions, cost functions,
a priori distributions will result in simple sampling plans. Although we
are unable to give such general conditions, we do give one class of situations
which always yield simple optimal plans. This class consists of the truncated
sequential dichotomy considered in Blackwell and Girshick [4] and the
proof of the simplicity of the optimal plans for this case follows directly
from their results as shown in Example 3.4.1 below. We also give a numerical
example, where it can be verified that the functions $h_j(m_j)$ satisfy the
conditions for simplicity without actually computing all these functions.

We conjecture that in most of the usual "practical" problems, the optimal
sampling plan is simple.

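Example 3.4.1 The Truncated Sequential Dichotomy. Let

$\Omega = (w_1, w_2)$, $w_1 \neq w_2$, $A = (a_1, a_2)$.

We assume $0 < w_1 < 1$ and $0 < w_2 < 1$. Let $\xi(w_1) = \xi_1$ and $\xi(w_2) = 1 - \xi_1$, $0 < \xi_1 < 1$, be the a priori distribution. Let the cost function $c_j(x)$, $j = 0, 1, 2, \dots, N$, be linear in $j$. $L(w, a)$ is given by the following matrix:

$$\begin{array}{c|cc} & a_1 & a_2 \\ \hline w_1 & 0 & b_{12} \\ w_2 & b_{21} & 0 \end{array}$$

where $b_{12} > 0$ and $b_{21} > 0$.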
Then $\xi'_j(m_j)$, the conditional probability of $w_1$ given $x_1, x_2, \dots, x_j$, is

$$\xi'_j(m_j) = \frac{\xi_1 w_1^{m_j} (1 - w_1)^{j - m_j}}{\xi_1 w_1^{m_j} (1 - w_1)^{j - m_j} + (1 - \xi_1) w_2^{m_j} (1 - w_2)^{j - m_j}},$$

and the conditional probability of $w_2$ is $1 - \xi'_j(m_j)$.


In 3.2, we have given an example of a non-simple optimal plan with a similar loss function but a non-linear cost function. It follows readily from the following results in Blackwell and Girshick [4], p. 263, that with a linear cost function, we always obtain a simple optimal plan.

Theorem 3.4.1 (Blackwell and Girshick) There exist numbers $\gamma_j$ and $\delta_j$, $j = 0, 1, 2, \dots, N$, such that if

(1) $\delta_{N-j} \le \xi'_j \le 1$, we choose $a_1$ without further observations.

(2) $0 \le \xi'_j \le \gamma_{N-j}$, we choose $a_2$ without further observations.

(3) $\gamma_{N-j} < \xi'_j < \delta_{N-j}$, we take another observation.

The important fact that enables us to show simplicity is that $\xi'_j(m_j)$ is a monotonic function of $m_j$ (for a fixed $j$):

$$\xi'_j(m_j) = \frac{1}{1 + \dfrac{1 - \xi_1}{\xi_1} \left( \dfrac{1 - w_2}{1 - w_1} \right)^{j} \left( \dfrac{w_2 (1 - w_1)}{w_1 (1 - w_2)} \right)^{m_j}}.$$

If $w_2 < w_1$, then $\frac{w_2 (1 - w_1)}{w_1 (1 - w_2)} < 1$ and $\xi'_j(m_j)$ is increasing with $m_j$. If $w_1 < w_2$, then $\xi'_j(m_j)$ is decreasing with $m_j$.
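The monotonicity claim can be checked numerically. The sketch below, in Python with exact rational arithmetic, computes the conditional probability of $w_1$ both directly from Bayes' rule and in the ratio form, and compares the two. The function names and the particular values of $w_1$, $w_2$, $\xi_1$ and $j$ are assumptions chosen only for illustration.

```python
from fractions import Fraction


def posterior_direct(j, m, w1, w2, xi1):
    """Conditional probability of w_1 after m successes in j trials (Bayes' rule)."""
    num = xi1 * w1 ** m * (1 - w1) ** (j - m)
    den = num + (1 - xi1) * w2 ** m * (1 - w2) ** (j - m)
    return num / den


def posterior_ratio_form(j, m, w1, w2, xi1):
    """Same quantity written as 1 / (1 + C * r**m), with r the odds ratio."""
    r = (w2 * (1 - w1)) / (w1 * (1 - w2))
    c = (1 - xi1) / xi1 * ((1 - w2) / (1 - w1)) ** j
    return 1 / (1 + c * r ** m)


# Illustrative values only (not taken from the text).
w1, w2, xi1, j = Fraction(3, 5), Fraction(2, 5), Fraction(1, 2), 6
values = [posterior_direct(j, m, w1, w2, xi1) for m in range(j + 1)]
swapped = [posterior_direct(j, m, w2, w1, xi1) for m in range(j + 1)]

print([float(v) for v in values])   # w_2 < w_1: increasing in m
print([float(v) for v in swapped])  # roles exchanged: decreasing in m
```

Exact fractions are used so that the two forms can be compared for equality rather than only approximately.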

We now assume that $\xi'_j(m_j)$ is increasing with $m_j$. A similar proof holds if $\xi'_j(m_j)$ is decreasing with $m_j$. We can also assume that $\xi_1 \in (\gamma_N, \delta_N)$, for otherwise we would take no observations and the plan is trivially simple. Consider $\xi'_1(0)$ and $\xi'_1(1)$. Either

(a) $\xi'_1(0) \in [0, \gamma_{N-1}]$,

(b) $\xi'_1(0) \in (\gamma_{N-1}, \delta_{N-1})$, or

(c) $\xi'_1(0) \in [\delta_{N-1}, 1]$.

Since $\xi'_1(1) > \xi'_1(0)$, $\xi'_1(1)$ can lie in one of the following three intervals, corresponding to the three cases (a), (b) and (c):

(a') $\xi'_1(1) \in [0, 1]$,

(b') $\xi'_1(1) \in (\gamma_{N-1}, 1]$,

(c') $\xi'_1(1) \in [\delta_{N-1}, 1]$.

Thus, either the points (1,0) and (0,1) are both boundary points, or both continuation points, or one is a boundary point and one a continuation point. We now assume that we have continued sampling to $j = k$ observations, that all the points on the line $x + y = r$, $r < k$, satisfy the conditions for simplicity, and that the points $(m'_k, k - m'_k), \dots, (m'_k + h, k - m'_k - h)$, $h \le k - m'_k$, are continuation points and all the other points on the line $x + y = k$ are either boundary points or inaccessible. Then, since $\xi'_{k+1}(m'_k + h + 1) > \xi'_{k+1}(m'_k)$, according to Theorem 3.4.1 only the following cases can arise.


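(1) $\xi'_{k+1}(m'_k) \in (\gamma_{N-k-1}, \delta_{N-k-1})$ and $\xi'_{k+1}(m'_k + h + 1) \in (\gamma_{N-k-1}, \delta_{N-k-1})$

(2) $\xi'_{k+1}(m'_k) \in (\gamma_{N-k-1}, \delta_{N-k-1})$ and $\xi'_{k+1}(m'_k + h + 1) \in [\delta_{N-k-1}, 1]$

(3) $\xi'_{k+1}(m'_k) \in [0, \gamma_{N-k-1}]$ and $\xi'_{k+1}(m'_k + h + 1) \in (\gamma_{N-k-1}, \delta_{N-k-1})$

(4) $\xi'_{k+1}(m'_k) \in [0, \gamma_{N-k-1}]$ and $\xi'_{k+1}(m'_k + h + 1) \in [\delta_{N-k-1}, 1]$

(5) $\xi'_{k+1}(m'_k) \in [0, \gamma_{N-k-1}]$ and $\xi'_{k+1}(m'_k + h + 1) \in [0, \gamma_{N-k-1}]$

(6) $\xi'_{k+1}(m'_k) \in [\delta_{N-k-1}, 1]$ and $\xi'_{k+1}(m'_k + h + 1) \in [\delta_{N-k-1}, 1]$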


(Note that these 6 cases are exactly analogous to the cases (a), (b), (c), (a'), (b'), (c') considered simultaneously, where naturally $m'_0 = 0$ and $h = 0$.) We remark that if there exist any points for which $m_{k+1} < m'_k$ or $m_{k+1} > m'_k + h + 1$, then these could only be boundary or inaccessible points. Since $\xi'_{k+1}(m_{k+1})$ is an increasing function of $m_{k+1}$, we see that in the first 4 cases all the continuation points on the line $x + y = k + 1$ form an interval. For example, in case (3), since $\xi'_{k+1}(m'_k) \in [0, \gamma_{N-k-1}]$ and $\xi'_{k+1}(m'_k + h + 1) \in (\gamma_{N-k-1}, \delta_{N-k-1})$, there is a first value $m''_{k+1}$ such that $m'_k < m''_{k+1} \le m'_k + h + 1$ and $\xi'_{k+1}(m''_{k+1}) \in (\gamma_{N-k-1}, \delta_{N-k-1})$. Thus, all the points such that $m_{k+1} < m''_{k+1}$ or $m_{k+1} > m'_k + h + 1$ are boundary or inaccessible points, while all the points such that $m''_{k+1} \le m_{k+1} \le m'_k + h + 1$ are continuation points. Similarly, in cases (5) and (6) all the points on the line $x + y = k + 1$ are boundary or inaccessible points. Thus, the sequential dichotomy yields a simple sampling plan.
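The conclusion of Example 3.4.1 can be illustrated by computing an optimal truncated plan directly. The Python sketch below is not the construction used in the text. It is an ordinary backward induction under stated assumptions: a cost of `c` per observation, and hypothetical values for $w_1$, $w_2$, $\xi_1$, $b_{12}$, $b_{21}$ and $N$. It then lists, for each $j$, the accessible values of $m_j$ at which sampling continues, and checks that they are consecutive.

```python
from fractions import Fraction


def continuation_sets(w1, w2, xi1, b12, b21, c, N):
    """Backward induction for the truncated dichotomy with cost c per observation.

    Returns, for j = 0..N-1, the sorted accessible m_j at which sampling continues.
    """
    def posterior(j, m):
        num = xi1 * w1 ** m * (1 - w1) ** (j - m)
        return num / (num + (1 - xi1) * w2 ** m * (1 - w2) ** (j - m))

    def stop_risk(j, m):
        xi = posterior(j, m)
        # a_1 loses b21 if w_2 is true; a_2 loses b12 if w_1 is true.
        return c * j + min((1 - xi) * b21, xi * b12)

    alpha = [stop_risk(N, m) for m in range(N + 1)]  # forced stop at j = N
    go = {}
    for j in range(N - 1, -1, -1):
        new_alpha = []
        for m in range(j + 1):
            xi = posterior(j, m)
            p = xi * w1 + (1 - xi) * w2  # predictive probability of a success
            cont = p * alpha[m + 1] + (1 - p) * alpha[m]
            u = stop_risk(j, m)
            go[(j, m)] = cont < u
            new_alpha.append(min(u, cont))
        alpha = new_alpha

    reachable, result = {0}, []
    for j in range(N):
        cont_ms = sorted(m for m in reachable if go[(j, m)])
        result.append(cont_ms)
        reachable = {m for c_m in cont_ms for m in (c_m, c_m + 1)}
    return result


def is_interval(ms):
    """True if the sorted values are consecutive integers (or fewer than two)."""
    return all(b - a == 1 for a, b in zip(ms, ms[1:]))


# Hypothetical parameter values chosen only for illustration.
plan = continuation_sets(Fraction(7, 10), Fraction(3, 10), Fraction(1, 2),
                         b12=10, b21=10, c=Fraction(1, 10), N=6)
print(plan)
print(all(is_interval(ms) for ms in plan))
```

Rational arithmetic avoids rounding ties at the thresholds; with floating-point inputs the comparison `cont < u` could in principle be affected by rounding.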

Example 3.4.2 We illustrate with a particular numerical example that we can establish that $h_j(m_j)$ satisfies the conditions of Lemma 3.4.1 without actually calculating $h_j(m_j)$ for all $j$. In this case also, the functions $h_j(m_j)$ satisfy condition (2) of Lemma 3.4.1.


Let $\Omega = \{w : 0 < w < 1\}$ and $A = \{a : 0 < a < 1\}$.

Let $\xi(w) = 1$, $c_j(x) = j$ and $L(w, a) = k(w - a)^2$, $k > 0$.
Then

$$\xi_j(w) = \frac{w^{m_j} (1 - w)^{j - m_j}}{\int_0^1 w^{m_j} (1 - w)^{j - m_j}\, dw}.$$

From Definition 1.7.8,

3.4.2 (1)   $U_j(m_j) = j + k \cdot \min_a \int_0^1 (w - a)^2\, \xi_j(w)\, dw = j + \dfrac{k (1 + m_j)(j + 1 - m_j)}{(j + 2)^2 (j + 3)}.$
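The closed form in 3.4.2 (1) can be verified with exact arithmetic. The Python sketch below uses the standard identity $\int_0^1 w^a (1 - w)^b\, dw = a!\, b! / (a + b + 1)!$ for non-negative integers and the fact that the minimum over $a$ of the expected squared error is the variance, attained at the mean. The function names and the range of $j$ checked are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial


def beta_integral(a, b):
    """Integral of w**a * (1 - w)**b over [0, 1] for non-negative integers a, b."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


def posterior_variance(j, m):
    """Variance of the density proportional to w**m * (1 - w)**(j - m)."""
    norm = beta_integral(m, j - m)
    mean = beta_integral(m + 1, j - m) / norm
    second_moment = beta_integral(m + 2, j - m) / norm
    return second_moment - mean ** 2


def stop_risk_closed_form(j, m, k):
    """Right-hand side of 3.4.2 (1)."""
    return j + Fraction(k * (1 + m) * (j + 1 - m), (j + 2) ** 2 * (j + 3))


k = 200
agree = all(j + k * posterior_variance(j, m) == stop_risk_closed_form(j, m, k)
            for j in range(8) for m in range(j + 1))
print(agree)
```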


We denote the particular value of the conditional expectation $E_j(U_{j+1})$ by $E_j U(m_j)$ and, similarly, particular values of $E_j(\alpha_{j+1})$ by $E_j(m_j)$ (cf. 1.7.6, p. 25). Then

3.4.2 (2)   $E_j U(m_j) = j + 1 + \dfrac{k}{(j + 3)^2 (j + 4)} \left\{ (1 + m_j)(j + 2 - m_j) + (j + 1 - 2 m_j)\, E_j(x_{j+1}) - E_j(x_{j+1}^2) \right\}$

$\qquad\qquad = j + 1 + \dfrac{k}{(j + 2)(j + 3)^2} (1 + m_j)(j + 1 - m_j),$

where we have used the fact that $E_j(x_{j+1}) = E_j(x_{j+1}^2) = \dfrac{m_j + 1}{j + 2}$. Now, by definition, $h_j(m_j) = U_j(m_j) - E_j(m_j)$, and we let $g_j(m_j) = U_j(m_j) - E_j U(m_j)$.

It can be easily shown from 3.4.2 (1) and 3.4.2 (2) that

$$g_j(m_j) = -1 + \frac{k}{(j + 2)^2 (j + 3)^2} (1 + m_j)(j + 1 - m_j).$$

Since $E_j(\alpha_{j+1}) = E_j[\min(U_{j+1}, E_{j+1})]$, it follows that $h_j(m_j) \ge g_j(m_j)$ for all $j$ and $m_j$.
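For reference, the algebra behind $g_j(m_j)$ and the inequality $h_j(m_j) \ge g_j(m_j)$ can be written out as follows. This is a sketch of one possible route, not necessarily the one intended in the text; the shorthand $p$ is introduced here for the predictive probability that $x_{j+1} = 1$.

```latex
\begin{align*}
p &= E_j(x_{j+1}) = \frac{m_j + 1}{j + 2}, \\
E_j U(m_j) &= j + 1 + \frac{k}{(j+3)^2 (j+4)}
  \Bigl[\, p\,(2 + m_j)(j + 1 - m_j) + (1 - p)(1 + m_j)(j + 2 - m_j) \Bigr] \\
&= j + 1 + \frac{k\,(1 + m_j)(j + 1 - m_j)}{(j+2)(j+3)^2}, \\
g_j(m_j) &= U_j(m_j) - E_j U(m_j) \\
&= -1 + k\,(1 + m_j)(j + 1 - m_j)
  \left[ \frac{1}{(j+2)^2 (j+3)} - \frac{1}{(j+2)(j+3)^2} \right] \\
&= -1 + \frac{k\,(1 + m_j)(j + 1 - m_j)}{(j+2)^2 (j+3)^2}, \\
h_j(m_j) - g_j(m_j) &= E_j(U_{j+1}) - E_j(\alpha_{j+1}) \ge 0 .
\end{align*}
```

The last line holds because $\alpha_{j+1} = \min(U_{j+1}, E_{j+1}) \le U_{j+1}$ at every point.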
We now consider the case when $N = 5$ and $k = 200$ to illustrate how these results can be used to establish that this plan is simple (cf. 3.2).

Now $h_4(m_4) = g_4(m_4)$. For $k = 200$,

$$g_4(2) = -1 + \frac{200 \cdot 3 \cdot 3}{6^2 \cdot 7^2} > 0 \quad \text{and} \quad g_4(1) = -1 + \frac{200 \cdot 2 \cdot 4}{6^2 \cdot 7^2} < 0.$$

Since $g_j(m_j)$ is symmetric about $j/2$ and monotonically increasing for $m_j < j/2$, we can conclude that $g_4(0) < 0$, $g_4(3) < 0$ and $g_4(4) < 0$. Similarly, we can establish that $g_3(0) < 0$, $g_3(1) > 0$, $g_3(2) > 0$, $g_3(3) < 0$, and $g_2(m_2) > 0$ for $m_2 = 0, 1, 2$, $g_1(m_1) > 0$ for $m_1 = 0, 1$ and $g_0(0) > 0$.

Thus, $h_4(m_4) > 0$ for $m_4 = 2$ and $h_4(m_4) < 0$ for $m_4 = 0, 1, 3, 4$. Similarly, since $h_j(m_j) \ge g_j(m_j)$, $h_3(m_3) > 0$ for $m_3 = 1, 2$; $h_3(m_3) < 0$ for $m_3 = 0, 3$, because from these points only stopping points with $j = 4$ can be reached, so that $h_3 = g_3$ there; $h_2(m_2) > 0$ for all $m_2$; and $h_1(m_1) > 0$ for all $m_1$. Thus, $h_j(m_j)$ satisfies condition (2) of Lemma 3.4.1 and the plan is simple.
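The signs quoted for $N = 5$ and $k = 200$ can be reproduced by carrying out the backward induction explicitly. The Python sketch below is an independent check rather than the method of the text, which deliberately avoids computing every $h_j(m_j)$. The function names are hypothetical; the formulas for $U_j$ and $g_j$ are those of 3.4.2 (1) and the expression for $g_j(m_j)$ above.

```python
from fractions import Fraction


def stop_risk(j, m, k):
    """U_j(m_j) from 3.4.2 (1): sampling cost j plus k times the posterior variance."""
    return j + Fraction(k * (1 + m) * (j + 1 - m), (j + 2) ** 2 * (j + 3))


def g(j, m, k):
    """g_j(m_j) = U_j(m_j) - E_j U(m_j), in closed form."""
    return -1 + Fraction(k * (1 + m) * (j + 1 - m), (j + 2) ** 2 * (j + 3) ** 2)


def h_table(N, k):
    """Backward induction; returns h[j][m] = U_j(m) - E_j(alpha_{j+1}) for j < N."""
    alpha = [stop_risk(N, m, k) for m in range(N + 1)]  # sampling must stop at j = N
    h = {}
    for j in range(N - 1, -1, -1):
        row, new_alpha = [], []
        for m in range(j + 1):
            p = Fraction(m + 1, j + 2)  # predictive probability that x_{j+1} = 1
            cont = p * alpha[m + 1] + (1 - p) * alpha[m]
            u = stop_risk(j, m, k)
            row.append(u - cont)
            new_alpha.append(min(u, cont))
        h[j], alpha = row, new_alpha
    return h


h = h_table(N=5, k=200)
for j in sorted(h):
    print(j, ["+" if v > 0 else "-" for v in h[j]])
```

For each $j$ the printed row has its `+` entries in a single unbroken run, which is the pattern that condition (2) requires.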


Although a similar procedure could be carried out for any $N$ and $k$, a detailed description taking care of all the possible cases that could arise is a formidable computational problem. In all the numerical examples we have attempted, all the optimal plans were simple and the functions $h_j(m_j)$ satisfied condition (2) of Lemma 3.4.1 and were symmetric about the point $x = j/2$, supporting the conjecture that all optimal plans for this situation are simple. However, our methods are not powerful enough to establish this conjecture.